[ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144237#comment-13144237 ]
Flavio Junqueira commented on ZOOKEEPER-1270:
---------------------------------------------

Here is some progress. I was actually looking at the wrong snippet. The correct one was the NEWLEADER handler:

{noformat}
case Leader.NEWLEADER: // it will be NEWLEADER in v1.0
    zk.takeSnapshot();
    snapshotTaken = true;
    writePacket(new QuorumPacket(Leader.ACK, newLeaderZxid, null, null), true);
    break;
}
{noformat}

We also take a snapshot here. Looking at the stack trace that Pat posted, we see that the learner handlers are stuck in the loop right after receiving the ack, which essentially waits for the leader to start. By the same stack trace, the leader is not starting because it is waiting for the followers to acknowledge the NEWLEADER message... but the followers have acknowledged the NEWLEADER message, otherwise the learner handlers wouldn't be executing that loop (Line 450). Unless I'm missing something, the problem must be in Leader.processAck.

> testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
> -----------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1270
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>            Reporter: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.4.0, 3.5.0
>
>         Attachments: ZOOKEEPER-1270tests.patch, ZOOKEEPER-1270tests2.patch,
>                      testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz,
>                      testEarlyLeaderAbandonment3.txt.gz
>
>
> Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily.
> This test was introduced in the following commit (all three JIRAs committed at once):
> ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs.
> ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader
> ZOOKEEPER-1082.
> modify leader election to correctly take into account current
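
The two wait conditions described in the comment (learner handlers waiting for the leader to start, and the leader waiting for a quorum of NEWLEADER acks) can be modelled with a small sketch. This is not ZooKeeper's code: the class `NewLeaderHandshakeSketch` and its methods are hypothetical names, and the quorum rule is assumed to be a simple majority. It only illustrates why, if ack counting works as intended, "acks delivered but leader not started" should not be a reachable state, which is what points suspicion at Leader.processAck.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the NEWLEADER handshake.
// It is NOT the actual Leader/LearnerHandler implementation.
public class NewLeaderHandshakeSketch {
    private final int ensembleSize;
    // Server ids that have acknowledged NEWLEADER (assumed to include the leader itself).
    private final Set<Long> ackSet = new HashSet<>();
    private boolean leaderStarted = false;

    public NewLeaderHandshakeSketch(int ensembleSize) {
        this.ensembleSize = ensembleSize;
    }

    // Record one NEWLEADER ack; start the leader once a majority has acked.
    public void processNewLeaderAck(long sid) {
        ackSet.add(sid);
        if (!leaderStarted && ackSet.size() > ensembleSize / 2) {
            leaderStarted = true;
        }
    }

    // The condition a learner handler would poll after receiving its ack.
    public boolean isLeaderStarted() {
        return leaderStarted;
    }

    public static void main(String[] args) {
        NewLeaderHandshakeSketch h = new NewLeaderHandshakeSketch(3);
        h.processNewLeaderAck(1L); // assumed: the leader's own ack
        System.out.println("after 1 ack: started=" + h.isLeaderStarted());
        h.processNewLeaderAck(2L); // one follower ack completes the majority
        System.out.println("after 2 acks: started=" + h.isLeaderStarted());
    }
}
```

In this model the leader starts as soon as the second of three servers acks. The stack trace described above corresponds to acks having arrived while the equivalent of `leaderStarted` stays false, so an ack must have been dropped or miscounted somewhere on the real code path.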