[ https://issues.apache.org/jira/browse/CASSANDRA-392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao updated CASSANDRA-392:
------------------------------
Attachment: issue392.patchv2
In SelectorManager.doProcess(), I don't see why we need to synchronize on each
selection key any more. Each of the handlers it calls (connect, read, and
write) already synchronizes on the selection key through
turnOnInterestOps/turnOffInterestOps, which hold the lock on the selection key
only briefly.
Attached is a patch that removes the selection key synchronization in
SelectorManager.doProcess(). Sammy, could you give it a try and see if it works?
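
For readers without the patch at hand, here is a minimal sketch of the idea, not the patch itself. Plain Objects stand in for the SelectionKey and TcpConnection monitors, and the class name ShortKeyLockSketch, the fields, and runBothPaths are hypothetical. The assumption is that once doProcess() no longer holds the key monitor, both the selector path and the writer path take the connection monitor first and the key monitor second, and only briefly.

```java
import java.nio.channels.SelectionKey;

class ShortKeyLockSketch {
    // Plain objects stand in for the SelectionKey and TcpConnection monitors (assumption).
    static final Object keyMonitor = new Object();
    static final Object connectionMonitor = new Object();
    static int interestOps = SelectionKey.OP_CONNECT; // guarded by keyMonitor

    // Hold the key monitor only long enough to flip the interest bits.
    static void turnOnInterestOps(int ops) {
        synchronized (keyMonitor) { interestOps |= ops; }
    }

    static void turnOffInterestOps(int ops) {
        synchronized (keyMonitor) { interestOps &= ~ops; }
    }

    // Writer path: connection monitor first, then a brief key lock.
    static void write() {
        synchronized (connectionMonitor) {
            turnOnInterestOps(SelectionKey.OP_WRITE);
        }
    }

    // Handler invoked by the selector thread; same lock order as write().
    static void connect() {
        synchronized (connectionMonitor) {
            turnOffInterestOps(SelectionKey.OP_CONNECT);
            turnOnInterestOps(SelectionKey.OP_READ);
        }
    }

    // Selector path with the outer synchronized (key) block removed.
    static void doProcess() {
        connect();
    }

    /** Runs both paths concurrently; returns true if both threads finish in time. */
    static boolean runBothPaths(int iterations) {
        Thread writer = new Thread(() -> { for (int i = 0; i < iterations; i++) write(); });
        Thread selector = new Thread(() -> { for (int i = 0; i < iterations; i++) doProcess(); });
        writer.setDaemon(true);
        selector.setDaemon(true);
        writer.start();
        selector.start();
        try {
            writer.join(10_000);
            selector.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !writer.isAlive() && !selector.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both paths completed: " + runBothPaths(100_000));
    }
}
```

Because neither thread ever waits for the connection monitor while holding the key monitor, the circular wait from the report cannot form in this sketch. Whether the real handlers are safe without the outer lock still depends on what else they touch.
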
> Deadlock with SelectorManager.doProcess and TcpConnection.write
> ---------------------------------------------------------------
>
> Key: CASSANDRA-392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-392
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.4
> Reporter: Sammy Yu
> Assignee: Sammy Yu
> Fix For: 0.4
>
> Attachments:
> 0001-CASSANDRA-392-Moved-turnOnInterestOps-outside-of-syn.patch,
> issue392.patchv2
>
>
> We ran into a deadlock last night:
> Name: MESSAGE-SERIALIZER-POOL:2
> State: BLOCKED on sun.nio.ch.selectionkeyi...@2e257f1b owned by: TCP Selector Manager
> Total blocked: 1 Total waited: 1
> Stack trace:
> org.apache.cassandra.net.SelectionKeyHandler.turnOnInterestOps(SelectionKeyHandler.java:73)
> org.apache.cassandra.net.TcpConnection.write(TcpConnection.java:186)
> - locked org.apache.cassandra.net.tcpconnect...@5ab9f791
> org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:67)
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> java.lang.Thread.run(Thread.java:619)
> Name: TCP Selector Manager
> State: BLOCKED on org.apache.cassandra.net.tcpconnect...@5ab9f791 owned by: MESSAGE-SERIALIZER-POOL:2
> Total blocked: 2 Total waited: 0
> Stack trace:
> org.apache.cassandra.net.TcpConnection.connect(TcpConnection.java:360)
> org.apache.cassandra.net.SelectorManager.doProcess(SelectorManager.java:131)
> - locked sun.nio.ch.selectionkeyi...@2e257f1b
> org.apache.cassandra.net.SelectorManager.run(SelectorManager.java:98)
> SelectorManager.doProcess acquires a monitor on the SelectionKey and
> then calls methods such as TcpConnection.connect(SelectionKey key), which
> obtain a monitor on the TcpConnection object itself. Another task, e.g.
> MessageSerializationTask, can come along and call write(Message message), which
> obtains the monitor on the TcpConnection first and then, in its call to
> turnOnInterestOps, tries to obtain the monitor on the SelectionKey. The two
> threads acquire the same two monitors in opposite order, which causes the
> deadlock.
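
The lock-order inversion in the quoted report can be reproduced outside Cassandra with a small, self-contained sketch. Everything in it is hypothetical: plain Objects stand in for the SelectionKey and TcpConnection monitors, and the thread names only mirror the roles in the thread dump. A latch forces both threads to hold their first monitor before asking for the second, and the JDK's ThreadMXBean is polled to confirm that the JVM reports the cycle.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class LockOrderInversionDemo {
    // Locks `first`, waits until the other thread holds its first lock, then tries `second`.
    static Thread lockInOrder(String name, Object first, Object second, CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the other thread holds `second` and waits for `first`.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit even though the threads stay deadlocked
        return t;
    }

    static boolean contains(long[] ids, long id) {
        for (long candidate : ids) {
            if (candidate == id) return true;
        }
        return false;
    }

    /** Returns true once the JVM reports both stand-in threads as monitor-deadlocked. */
    static boolean reproduceDeadlock() {
        Object keyMonitor = new Object();        // stand-in for the SelectionKey
        Object connectionMonitor = new Object(); // stand-in for the TcpConnection
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Selector role: key first, then connection (doProcess -> connect).
        Thread selector = lockInOrder("selector-stand-in", keyMonitor, connectionMonitor, bothHoldFirst);
        // Writer role: connection first, then key (write -> turnOnInterestOps).
        Thread writer = lockInOrder("writer-stand-in", connectionMonitor, keyMonitor, bothHoldFirst);
        selector.start();
        writer.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        try {
            for (int attempt = 0; attempt < 100; attempt++) {
                long[] ids = threads.findMonitorDeadlockedThreads();
                if (ids != null && contains(ids, selector.getId()) && contains(ids, writer.getId())) {
                    return true;
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + reproduceDeadlock());
    }
}
```

In the real code the interleaving depends on timing, which is why the deadlock only shows up occasionally. The latch here simply makes the bad interleaving deterministic.
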