Glad it's working so far. If you do see any issues, please let us know and/or file a JIRA.

By the way, since you're using Avro 1.5.2 with Netty, you can now take advantage of asynchronous RPCs if that suits your use case. I have some sample code out here: https://github.com/jbaldassari/Avro-RPC
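The basic pattern is to ask SpecificRequestor for the generated *.Callback sub-interface instead of the plain one. Here's a rough, untested sketch; "AdEvents", "collectAdClick", the "click" object, and the address are made-up stand-ins for whatever your IDL actually generates:

    import java.net.InetSocketAddress;
    import org.apache.avro.ipc.CallFuture;
    import org.apache.avro.ipc.NettyTransceiver;
    import org.apache.avro.ipc.specific.SpecificRequestor;

    // The generated *.Callback sub-interface adds an async variant of each
    // RPC method that takes a Callback<T> as its last argument.
    NettyTransceiver transceiver =
        new NettyTransceiver(new InetSocketAddress("localhost", 65111));
    AdEvents.Callback client =
        SpecificRequestor.getClient(AdEvents.Callback.class, transceiver);

    // Fire the RPC without blocking; the CallFuture completes when the
    // response (or an error) comes back from the server.
    CallFuture<Void> result = new CallFuture<Void>();
    client.collectAdClick(click, result);
    // ... do other work ...
    result.get();  // block only when you actually need the result

You can also pass your own org.apache.avro.ipc.Callback implementation instead of a CallFuture if you'd rather be notified of the result than wait for it.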
-James

On Wed, Sep 14, 2011 at 3:06 PM, Yang <[email protected]> wrote:
> Yeah, I found I was actually using 1.5.1. I updated to 1.5.2, and it has
> now been working fine for an hour.
>
> Thanks a lot!
> Yang
>
> On Wed, Sep 14, 2011 at 10:56 AM, James Baldassari <[email protected]> wrote:
> > It appears to be pre-1.5.2 from this part of the stack trace:
> >
> >     at java.util.concurrent.Semaphore.acquire(Semaphore.java:313)
> >     at org.apache.avro.ipc.NettyTransceiver$CallFuture.get(NettyTransceiver.java:203)
> >
> > CallFuture was moved out of NettyTransceiver as part of AVRO-539 and is now
> > a stand-alone class. Also, the Semaphore inside CallFuture was replaced
> > with a CountDownLatch, so in 1.5.2 and later we should never see CallFuture
> > waiting on a Semaphore.
> >
> > From your initial description it appears that some temporary network
> > disruption might have caused the connection between the client and server
> > to close, and then the client never recovered. This doesn't surprise me,
> > because I don't think the pre-1.5.2 NettyTransceiver had any way to recover
> > from a connection failure. While working on AVRO-539 I modified the
> > transceiver code so that it attempts to re-establish the connection if the
> > connection is lost, which is why I think the upgrade may help you. Just a
> > guess, though. But like I said, since the code has changed so much in 1.5.2
> > and later, it will be much easier to figure out what's wrong (and fix it if
> > necessary) if you can reproduce the problem using 1.5.2 or later.
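> >
> > In the meantime, one way to cope on the client side is to treat the
> > transceiver as disposable: when a call fails, close the old transceiver,
> > create a fresh one, and regenerate the proxy. A rough, untested sketch
> > (EventsCollector and the variable names are made up to match your code):
> >
> >     try {
> >         proxy.collect_ad_click(click);
> >     } catch (IOException e) {
> >         transceiver.close();  // discard the dead channel
> >         transceiver = new NettyTransceiver(serverAddress);
> >         proxy = SpecificRequestor.getClient(EventsCollector.class, transceiver);
> >         proxy.collect_ad_click(click);  // one retry on the new connection
> >     }
> >
> > That also answers your question about telling Netty to "reset/replenish"
> > its connections: each NettyTransceiver holds a single channel rather than
> > a pool, so replacing the transceiver replaces the connection.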
> >
> > -James
> >
> > On Wed, Sep 14, 2011 at 1:39 PM, Yang <[email protected]> wrote:
> >> Thanks, James.
> >>
> >> I *think* I'm using 1.5.2, but I could check to be sure.
> >> How do you determine that it is a pre-1.5.2 version?
> >>
> >> Yang
> >>
> >> On Wed, Sep 14, 2011 at 10:25 AM, James Baldassari <[email protected]> wrote:
> >> > Hi Yang,
> >> >
> >> > From the stack trace you posted it appears that you are using a version
> >> > of Avro prior to 1.5.2. Which version are you using? There have been a
> >> > number of significant changes recently to the RPC framework, and to the
> >> > Netty implementation in particular. Could you please try to reproduce
> >> > the problem using Avro 1.5.2 or newer? The problem may be resolved by an
> >> > upgrade. If the problem still exists in the newer versions, it will be a
> >> > lot easier to diagnose/fix if we can see stack traces from a post-1.5.2
> >> > version.
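> >> >
> >> > If you're not sure which version is actually on the classpath, you can
> >> > ask the jar directly. Something like this usually works, although
> >> > getImplementationVersion() can return null depending on how the jar's
> >> > manifest was built; the code-source URL will at least tell you which
> >> > jar the class was loaded from:
> >> >
> >> >     Class<?> c = org.apache.avro.ipc.NettyTransceiver.class;
> >> >     System.out.println(c.getPackage().getImplementationVersion());
> >> >     System.out.println(c.getProtectionDomain().getCodeSource().getLocation());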
> >> >
> >> > Thanks,
> >> > James
> >> >
> >> > On Wed, Sep 14, 2011 at 1:08 PM, Yang <[email protected]> wrote:
> >> >> I'm always seeing these "channel closed" exceptions, with low
> >> >> probability, i.e. about every 10 hours under heavy load.
> >> >>
> >> >> I'm not sure whether it's the server or the client that got the channel
> >> >> closed, so I've included the exception stacks from both sides.
> >> >> Does anybody have an idea how to debug this?
> >> >>
> >> >> Also, supposing there is a valid reason for closing the channel, what
> >> >> is my strategy for coping with it? I originally had many senders; due
> >> >> to the channel-close exceptions, many of them died. After that, only 2
> >> >> application threads remained, but they all seem blocked on trying to
> >> >> grab a connection from Netty's pool, so even if I create new sender
> >> >> threads, it seems they would still block. So how can I tell Netty to
> >> >> "reset/replenish" its connections?
> >> >>
> >> >> Thanks a lot,
> >> >> Yang
> >> >>
> >> >> Client side:
> >> >>
> >> >> WARN 16:51:02,079 Unexpected exception from downstream.
> >> >> java.nio.channels.ClosedChannelException
> >> >>     at org.jboss.netty.channel.socket.nio.NioWorker.cleanUpWriteBuffer(NioWorker.java:636)
> >> >>     at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:369)
> >> >>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:117)
> >> >>     at org.jboss.netty.channel.Channels.write(Channels.java:632)
> >> >>     at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:70)
> >> >>     at org.jboss.netty.channel.Channels.write(Channels.java:611)
> >> >>     at org.jboss.netty.channel.Channels.write(Channels.java:578)
> >> >>     at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:259)
> >> >>     at org.apache.avro.ipc.NettyTransceiver.transceive(NettyTransceiver.java:131)
> >> >>     at org.apache.avro.ipc.Requestor.request(Requestor.java:134)
> >> >>     at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:55)
> >> >>     at $Proxy0.collect_ad_click(Unknown Source)
> >> >>
> >> >> Server side:
> >> >>
> >> >> WARN 16:51:01,939 Unexpected exception from downstream.
> >> >> java.io.IOException: Broken pipe
> >> >>     at sun.nio.ch.FileDispatcher.write0(Native Method)
> >> >>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> >> >>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:122)
> >> >>     at sun.nio.ch.IOUtil.write(IOUtil.java:78)
> >> >>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:352)
> >> >>     at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$PooledSendBuffer.transferTo(SocketSendBufferPool.java:239)
> >> >>     at org.jboss.netty.channel.socket.nio.NioWorker.write0(NioWorker.java:469)
> >> >>     at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:387)
> >> >>     at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:137)
> >> >>     at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:76)
> >> >>     at org.jboss.netty.channel.Channels.write(Channels.java:632)
> >> >>     at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:70)
> >> >>     at org.jboss.netty.channel.Channels.write(Channels.java:611)
> >> >>     at org.jboss.netty.channel.Channels.write(Channels.java:578)
> >> >>     at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:259)
> >> >>     at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.messageReceived(NettyServer.java:137)
> >> >>     at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:120)
> >> >>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:302)
> >> >>     at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:317)
> >> >>     at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:299)
> >> >>     at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:216)
> >> >>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
> >> >>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
> >> >>     at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
> >> >>     at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
> >> >>     at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
> >> >>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> >> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> >> >>     at java.lang.Thread.run(Thread.java:679)
> >> >>
> >> >> After the exception, the client sender threads are blocked in this state:
> >> >>
> >> >> "EventQueue-Sender5" prio=10 tid=0x00007f5a0c2f5000 nid=0x6e3b waiting on condition [0x00007f5a19519000]
> >> >>    java.lang.Thread.State: WAITING (parking)
> >> >>     at sun.misc.Unsafe.park(Native Method)
> >> >>     - parking to wait for <0x00000007510e00c0> (a java.util.concurrent.Semaphore$NonfairSync)
> >> >>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> >> >>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
> >> >>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
> >> >>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> >> >>     at java.util.concurrent.Semaphore.acquire(Semaphore.java:313)
> >> >>     at org.apache.avro.ipc.NettyTransceiver$CallFuture.get(NettyTransceiver.java:203)
> >> >>     at org.apache.avro.ipc.NettyTransceiver.transceive(NettyTransceiver.java:133)
> >> >>     at org.apache.avro.ipc.Requestor.request(Requestor.java:134)
> >> >>     - locked <0x0000000757144220> (a org.apache.avro.ipc.specific.SpecificRequestor)
> >> >>     at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:55)
> >> >>     at $Proxy0.collect_ad_click(Unknown Source)
> >> >>     at com.cgm.whisky.emitter.ConnectionPool$EventsCollectorWithSerial.collect_ad_click(ConnectionPool.java:60)
