NettyTransceiver will deadlock when attempting transceive/disconnect on the same thread
---------------------------------------------------------------------------------------

                 Key: AVRO-1027
                 URL: https://issues.apache.org/jira/browse/AVRO-1027
             Project: Avro
          Issue Type: Bug
          Components: java
    Affects Versions: 1.6.1
            Reporter: Simon Wilkinson


If an Exception is caught while trying to write to a Channel, Netty can deliver 
the Exception to a ChannelUpstreamHandler on the same thread that attempted to 
write to the Channel. If this occurs with the NettyClientAvroHandler 
implementation of ChannelUpstreamHandler then the thread will deadlock.

Specifically, NettyClientAvroHandler overrides the 
ChannelUpstreamHandler.exceptionCaught() method to perform a disconnect, which 
requires the NettyTransceiver's write lock. However, in the above situation, 
the thread will already have locked the NettyTransceiver's read lock to write 
to the Channel. ReentrantReadWriteLock does not allow upgrading from a read to 
a write lock, hence the thread deadlocks.
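
The lock behaviour described above can be reproduced without Netty or Avro. 
The following is a minimal sketch, not code from NettyTransceiver; the class 
and method names are made up for illustration. It uses a timed tryLock() so 
that the demonstration gives up after 100 ms instead of parking forever the 
way writeLock().lock() does in the stack trace below.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; not part of Avro.
class ReadToWriteUpgradeDemo {

    /**
     * Tries to take the write lock while the calling thread already holds
     * the read lock. Returns true only if the write lock was obtained.
     */
    static boolean tryUpgradeWhileHoldingReadLock() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            // An untimed writeLock().lock() would block forever here,
            // because the write lock waits for all read holds to be
            // released, including this thread's own.
            boolean acquired = lock.writeLock().tryLock(100, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.readLock().unlock();
        }
    }

    /**
     * The opposite direction (downgrading) is allowed: a thread holding
     * the write lock can also take the read lock.
     */
    static boolean tryDowngradeWhileHoldingWriteLock() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.writeLock().lock();
        try {
            boolean acquired = lock.readLock().tryLock();
            if (acquired) {
                lock.readLock().unlock();
            }
            return acquired;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("upgrade acquired:   " + tryUpgradeWhileHoldingReadLock());
        System.out.println("downgrade acquired: " + tryDowngradeWhileHoldingWriteLock());
    }
}
```

The upgrade attempt reports false and the downgrade attempt reports true, 
which matches the documented behaviour of ReentrantReadWriteLock.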

Example stack trace (simplified):

"SessionManager-TimeoutPoller" prio=10 tid=0x7b689c00 nid=0x375d waiting on condition [0x7b0ad000..0x7b0ade70]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0xf2a944d8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:807)
>>> [Acquire write lock] at org.apache.avro.ipc.NettyTransceiver.disconnect(NettyTransceiver.java:285)
    at org.apache.avro.ipc.NettyTransceiver.access$2(NettyTransceiver.java:281)
    at org.apache.avro.ipc.NettyTransceiver$NettyClientAvroHandler.exceptionCaught(NettyTransceiver.java:499)
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:122)
    at org.apache.avro.ipc.NettyTransceiver$NettyClientAvroHandler.handleUpstream(NettyTransceiver.java:473)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:783)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:238)
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:122)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:432)
    at org.jboss.netty.channel.socket.nio.NioWorker.cleanUpWriteBuffer(NioWorker.java:661)
    at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:372)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:117)
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:771)
    at org.jboss.netty.channel.Channels.write(Channels.java:632)
    at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:70)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
    at org.jboss.netty.channel.Channels.write(Channels.java:611)
    at org.jboss.netty.channel.Channels.write(Channels.java:578)
    at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:251)
>>> [Acquire read lock] at org.apache.avro.ipc.NettyTransceiver.writeDataPack(NettyTransceiver.java:413)
>>> [Acquire read lock] at org.apache.avro.ipc.NettyTransceiver.transceive(NettyTransceiver.java:394)
    at org.apache.avro.ipc.Requestor.request(Requestor.java:147)
    at org.apache.avro.ipc.Requestor.request(Requestor.java:129)
    at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:68)
    <snip>


Note that in Avro 1.6.1 the read lock is acquired in both 
NettyTransceiver.transceive() and NettyTransceiver.writeDataPack(), as the 
two marked frames in the trace show. AVRO-1013 fixes this so that it is 
acquired only once, in NettyTransceiver.transceive().

I've attached a patch that demonstrates a potential fix for the deadlock; the 
patch assumes that AVRO-1013 has also been applied.
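
For illustration only: the attached patch is not reproduced in this message, 
so the following is a sketch of one possible approach and not necessarily 
what the patch does. The idea is for disconnect() to release any read holds 
owned by the current thread before taking the write lock, and to reacquire 
them afterwards so that the caller's own unlock still balances. The class 
DisconnectSketch, the "connected" flag and the failWrite parameter are 
hypothetical stand-ins; only the name stateLock and the methods transceive() 
and disconnect() echo the real transceiver.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for NettyTransceiver; not the real class.
class DisconnectSketch {
    private final ReentrantReadWriteLock stateLock = new ReentrantReadWriteLock();
    private boolean connected = true;

    /** Stand-in for transceive(): holds the read lock while "writing". */
    void transceive(boolean failWrite) {
        stateLock.readLock().lock();
        try {
            if (failWrite) {
                // Stand-in for Netty delivering exceptionCaught() on the
                // same thread that attempted the write.
                disconnect();
            }
        } finally {
            stateLock.readLock().unlock();
        }
    }

    /** Takes the write lock even if the caller already holds read locks. */
    void disconnect() {
        // Number of read holds owned by the current thread (0 when called
        // from a thread that is not inside transceive()).
        int readHolds = stateLock.getReadHoldCount();
        for (int i = 0; i < readHolds; i++) {
            stateLock.readLock().unlock();
        }
        stateLock.writeLock().lock();
        try {
            connected = false;
        } finally {
            // Reacquire the read holds while still holding the write lock
            // (downgrading is permitted), then release the write lock.
            for (int i = 0; i < readHolds; i++) {
                stateLock.readLock().lock();
            }
            stateLock.writeLock().unlock();
        }
    }

    boolean isConnected() {
        stateLock.readLock().lock();
        try {
            return connected;
        } finally {
            stateLock.readLock().unlock();
        }
    }

    /** True when no thread holds the read or write lock. */
    boolean isIdle() {
        return !stateLock.isWriteLocked() && stateLock.getReadLockCount() == 0;
    }

    public static void main(String[] args) {
        DisconnectSketch t = new DisconnectSketch();
        t.transceive(true); // returns instead of deadlocking
        System.out.println("connected: " + t.isConnected() + ", idle: " + t.isIdle());
    }
}
```

One trade-off of this approach is that another thread may take the write 
lock in the window between releasing the read holds and acquiring the write 
lock, so any state the caller read before the failed write has to be treated 
as possibly stale once disconnect() returns.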

