[ https://issues.apache.org/jira/browse/ZOOKEEPER-140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flavio Paiva Junqueira resolved ZOOKEEPER-140.
----------------------------------------------

    Resolution: Fixed

This issue has been resolved by the patch of 127, which has been committed.

> Deadlock in QuorumCnxManager
> ----------------------------
>
>                 Key: ZOOKEEPER-140
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-140
>             Project: Zookeeper
>          Issue Type: Bug
>            Reporter: Flavio Paiva Junqueira
>
> Frequently the servers deadlock in QuorumCnxManager:initiateConnection on
> s.read(msgBuffer) when reading the challenge from the peer.
> Calls to initiateConnection and receiveConnection are synchronized, so only
> one or the other can be executing at a time. This prevents two connections
> from opening between the same pair of servers.
> However, it seems that this leads to deadlock, as in this scenario:
> {noformat}
> A (initiate --> B)
> B (initiate --> C)
> C (initiate --> A)
> {noformat}
> initiateConnection can only complete when receiveConnection runs on the
> remote peer and answers the challenge. If all servers are blocked in
> initiateConnection, receiveConnection never runs and leader election halts.
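
The ring scenario in the report (A to B, B to C, C to A) can be reproduced in miniature. What follows is only a sketch under stated assumptions, not the QuorumCnxManager code: the names DeadlockSketch, Peer and blockedPeers are hypothetical, each server is reduced to a single lock standing in for the shared monitor of initiateConnection and receiveConnection, and the blocking socket read is replaced by ReentrantLock.tryLock with a timeout so that the demonstration terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the reported scenario; not ZooKeeper code.
final class DeadlockSketch {

    // One lock per server, standing in for the monitor that serializes
    // initiateConnection and receiveConnection on that server.
    static final class Peer {
        final ReentrantLock lock = new ReentrantLock();
    }

    // Builds a ring of n peers in which each peer initiates a connection to
    // the next one, and returns how many initiators never got an answer.
    static int blockedPeers(int n, long timeoutMs) {
        Peer[] peers = new Peer[n];
        for (int i = 0; i < n; i++) {
            peers[i] = new Peer();
        }
        CountDownLatch allInside = new CountDownLatch(n);    // everyone is inside "initiate"
        CountDownLatch allAttempted = new CountDownLatch(n); // everyone has waited for a reply
        AtomicInteger blocked = new AtomicInteger();
        Thread[] threads = new Thread[n];

        for (int i = 0; i < n; i++) {
            final Peer self = peers[i];
            final Peer remote = peers[(i + 1) % n];
            threads[i] = new Thread(() -> {
                self.lock.lock(); // enter the synchronized initiate path
                try {
                    allInside.countDown();
                    allInside.await();
                    // The remote can answer the challenge only if its lock is
                    // free, i.e. if it is not itself stuck in "initiate".
                    if (remote.lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                        remote.lock.unlock();
                    } else {
                        blocked.incrementAndGet();
                    }
                    allAttempted.countDown();
                    allAttempted.await(); // keep own lock until all attempts finish
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    self.lock.unlock();
                }
            });
            threads[i].start();
        }

        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return blocked.get();
    }

    public static void main(String[] args) {
        System.out.println("blocked initiators: " + blockedPeers(3, 200));
    }
}
```

With three peers, every initiator times out, because each one holds the lock its predecessor in the ring is waiting for. In the real code there is no timeout on the read, so the equivalent wait would never end.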