[ https://issues.apache.org/jira/browse/ZOOKEEPER-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651503#comment-13651503 ]
Patrick Hunt commented on ZOOKEEPER-1697:
-----------------------------------------

I've attached a patch for trunk & 3.4 that I think will fix the problem. All tests pass on branch34; however, I'm running into ZOOKEEPER-1700 on trunk and can't get the tests to run cleanly (only FLETest.testJoin is failing on trunk, but that fails with or without this patch applied).

Please take a look and let me know if this patch makes sense to you for both 3.4 and trunk.

> large snapshots can cause continuous quorum failure
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-1697
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1697
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3
>            Reporter: Patrick Hunt
>            Assignee: Patrick Hunt
>            Priority: Critical
>             Fix For: 3.5.0, 3.4.6
>
>         Attachments: ZOOKEEPER-1697_branch34.patch, ZOOKEEPER-1697.patch
>
>
> I keep seeing this on the leader:
> 2013-04-30 01:18:39,754 INFO org.apache.zookeeper.server.quorum.Leader: Shutdown called
> java.lang.Exception: shutdown Leader! reason: Only 0 followers, need 2
>     at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:447)
>     at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:422)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:753)
> The followers are downloading the snapshot when this happens, and are
> trying to do their first ACK to the leader; the ack fails with broken
> pipe.
> In this case the snapshots are large and the config has increased the
> initLimit. syncLimit is small - 10 or so with ticktime of 2000. Note
> this is 3.4.3 with ZOOKEEPER-1521 applied.
> I originally speculated that
> https://issues.apache.org/jira/browse/ZOOKEEPER-1521 might be related.
> I thought I might have broken something for this environment. That
> doesn't look to be the case.
> As it looks now it seems that 1521 didn't go far enough.
> The leader verifies that all followers have ACK'd to the leader within
> the last "syncLimit" time period. This runs all the time in the
> background on the leader to identify the case where a follower drops.
> In this case the followers take so long to load the snapshot that this
> check fails the very first time. As a result the leader drops (not
> enough ack'd followers within the sync limit) and re-election happens.
> This repeats forever (the above error).
> This is the call that's at odds:
> org.apache.zookeeper.server.quorum.LearnerHandler.synced()
> Look at the setting of tickOfLastAck in
> org.apache.zookeeper.server.quorum.LearnerHandler.run()
> It's not set until the follower first acks - in this case I can see
> that the followers are not getting to the ack prior to the leader
> shutting down due to the error log above.
> It seems that synced() should probably use the init limit until the
> first ack comes in from the follower. I also see that while
> tickOfLastAck and leader.self.tick are shared between two threads,
> there is no synchronization of the shared resources.
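
To illustrate the idea in the description above (apply the init limit until the first ack, then the sync limit), here is a minimal, self-contained Java sketch. It is not the attached patch and not the real LearnerHandler code: the class names SyncedSketch, LeaderState and Handler, the NO_ACK sentinel, and the startTick field are all hypothetical stand-ins. AtomicLong is used only to show one way the tick values could be shared safely between threads. Limits are counted in ticks, so a syncLimit of 10 with a tick time of 2000 ms gives a window of roughly 20 seconds.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SyncedSketch {

    // Hypothetical stand-in for the leader's tick counter and limits.
    static final class LeaderState {
        final AtomicLong tick = new AtomicLong();
        final int initLimit; // ticks a learner gets before its first ack
        final int syncLimit; // ticks allowed between later acks

        LeaderState(int initLimit, int syncLimit) {
            this.initLimit = initLimit;
            this.syncLimit = syncLimit;
        }
    }

    // Hypothetical stand-in for the per-learner handler.
    static final class Handler {
        static final long NO_ACK = -1L; // assumed sentinel: no ack seen yet

        private final LeaderState leader;
        private final long startTick; // leader tick when this handler started
        private final AtomicLong tickOfLastAck = new AtomicLong(NO_ACK);

        Handler(LeaderState leader) {
            this.leader = leader;
            this.startTick = leader.tick.get();
        }

        // Called when an ack arrives from the learner.
        void onAck() {
            tickOfLastAck.set(leader.tick.get());
        }

        // Before the first ack, allow initLimit ticks since the handler
        // started; afterwards, allow syncLimit ticks since the last ack.
        boolean synced() {
            long now = leader.tick.get();
            long last = tickOfLastAck.get();
            if (last == NO_ACK) {
                return now - startTick <= leader.initLimit;
            }
            return now - last <= leader.syncLimit;
        }
    }

    public static void main(String[] args) {
        LeaderState leader = new LeaderState(100, 10);
        Handler handler = new Handler(leader);

        leader.tick.set(50); // still loading the snapshot, no ack yet
        System.out.println("before first ack: " + handler.synced());

        handler.onAck();     // first ack arrives at tick 50
        leader.tick.set(61); // 11 ticks later, past syncLimit
        System.out.println("after missed acks: " + handler.synced());
    }
}
```

With a check like this, a follower that is still loading a large snapshot is not counted as dropped until the init limit has passed, which is the behaviour the description asks for.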