[ https://issues.apache.org/jira/browse/AVRO-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221055#comment-13221055 ]
James Baldassari commented on AVRO-1027:
----------------------------------------
All my tests are passing with trunk + AVRO-1027.
> NettyTransceiver will deadlock when attempting transceive/disconnect on the same thread
> ---------------------------------------------------------------------------------------
>
> Key: AVRO-1027
> URL: https://issues.apache.org/jira/browse/AVRO-1027
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.6.1
> Reporter: Simon Wilkinson
> Assignee: James Baldassari
> Fix For: 1.6.3
>
> Attachments: AVRO-1027-v2.patch, AVRO-1027.patch
>
>
> If an Exception is caught while trying to write to a Channel, Netty can
> deliver the Exception to a ChannelUpstreamHandler on the same thread that
> attempted to write to the Channel. If this occurs with the
> NettyClientAvroHandler implementation of ChannelUpstreamHandler then the
> thread will deadlock.
> Specifically, NettyClientAvroHandler overrides the
> ChannelUpstreamHandler.exceptionCaught() method to perform a disconnect,
> which requires the NettyTransceiver's write lock. However, in the above
> situation, the thread will already have locked the NettyTransceiver's read
> lock to write to the Channel. ReentrantReadWriteLock does not allow upgrading
> from a read to a write lock, hence the thread deadlocks.
> Example stack trace (simplified):
> "SessionManager-TimeoutPoller" prio=10 tid=0x7b689c00 nid=0x375d waiting on condition [0x7b0ad000..0x7b0ade70]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0xf2a944d8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
> at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:807)
> >>> [Acquire write lock] at org.apache.avro.ipc.NettyTransceiver.disconnect(NettyTransceiver.java:285)
> at org.apache.avro.ipc.NettyTransceiver.access$2(NettyTransceiver.java:281)
> at org.apache.avro.ipc.NettyTransceiver$NettyClientAvroHandler.exceptionCaught(NettyTransceiver.java:499)
> at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:122)
> at org.apache.avro.ipc.NettyTransceiver$NettyClientAvroHandler.handleUpstream(NettyTransceiver.java:473)
> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
> at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:783)
> at org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:238)
> at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:122)
> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
> at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:432)
> at org.jboss.netty.channel.socket.nio.NioWorker.cleanUpWriteBuffer(NioWorker.java:661)
> at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:372)
> at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:117)
> at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:771)
> at org.jboss.netty.channel.Channels.write(Channels.java:632)
> at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:70)
> at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
> at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
> at org.jboss.netty.channel.Channels.write(Channels.java:611)
> at org.jboss.netty.channel.Channels.write(Channels.java:578)
> at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:251)
> >>> [Acquire read lock] at org.apache.avro.ipc.NettyTransceiver.writeDataPack(NettyTransceiver.java:413)
> >>> [Acquire read lock] at org.apache.avro.ipc.NettyTransceiver.transceive(NettyTransceiver.java:394)
> at org.apache.avro.ipc.Requestor.request(Requestor.java:147)
> at org.apache.avro.ipc.Requestor.request(Requestor.java:129)
> at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:68)
> <snip>
> Note, in Avro 1.6.1 the read lock is acquired in both
> NettyTransceiver.transceive() and NettyTransceiver.writeDataPack(). AVRO-1013
> fixes this so that it is acquired only once in NettyTransceiver.transceive().
> I've attached a patch that demonstrates a potential fix for the deadlock; the
> patch assumes that AVRO-1013 has also been applied.
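
For anyone following along, the read-to-write upgrade problem described in the issue can be reproduced without Netty or Avro. The sketch below is only an illustration: the class and method names are hypothetical, and the second method shows one generic way around the problem (release this thread's read holds, take the write lock, then restore the holds). It is not meant to describe what the attached patches do. It uses tryLock() so the demo returns instead of parking forever, which is what a blocking lock() does in the stack trace above.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; not part of Avro or Netty.
public class ReadToWriteUpgradeDemo {

    // Tries to take the write lock while this thread still holds the read lock.
    static boolean upgradeWhileHoldingReadLock() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            // A blocking writeLock().lock() here would never return.
            boolean acquired = lock.writeLock().tryLock();
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Releases this thread's read holds first, takes the write lock, then
    // restores the read holds (write-to-read downgrading is permitted).
    static boolean upgradeAfterReleasingReadLock() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            // The read lock is reentrant, so it may be held more than once.
            int holds = lock.getReadHoldCount();
            for (int i = 0; i < holds; i++) {
                lock.readLock().unlock();
            }
            boolean acquired = lock.writeLock().tryLock();
            try {
                return acquired;
            } finally {
                // Restore the read holds so the caller's unlock still balances.
                for (int i = 0; i < holds; i++) {
                    lock.readLock().lock();
                }
                if (acquired) {
                    lock.writeLock().unlock();
                }
            }
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("upgrade while holding read lock: "
                + upgradeWhileHoldingReadLock());
        System.out.println("upgrade after releasing read lock: "
                + upgradeAfterReleasingReadLock());
    }
}
```

Note that between releasing the read holds and acquiring the write lock another thread may get in, so any state read under the read lock has to be re-checked once the write lock is held.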