[ https://issues.apache.org/jira/browse/ZOOKEEPER-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419776#comment-13419776 ]
Hadoop QA commented on ZOOKEEPER-1514:
--------------------------------------
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12537441/ZOOKEEPER-1514.patch
against trunk revision 1362660.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac
compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.
+1 release audit. The applied patch does not increase the total number of
release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results:
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1142//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1142//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1142//console
This message is automatically generated.
> FastLeaderElection - leader ignores the round information when joining a
> quorum
> -------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-1514
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1514
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.3.4
> Reporter: Patrick Hunt
> Assignee: Flavio Junqueira
> Priority: Critical
> Fix For: 3.4.4, 3.5.0, 3.3.7
>
> Attachments: ZOOKEEPER-1514.patch, ZOOKEEPER-1514.patch
>
>
> In the following case we have a 3 server ensemble.
> Initially all is well, zk3 is the leader.
> However, zk3 fails, restarts, and rejoins the quorum as the new leader (it was
> the old leader and is still the leader after re-election).
> The two existing followers, zk1 and zk2, rejoin the new quorum again as
> followers of zk3.
> zk1 then fails, its data directory is deleted (so it has no state whatsoever),
> and it is restarted. However, zk1 can never rejoin the quorum (even after an
> hour).
> During this time zk2 and zk3 are serving properly.
> Later all three servers are restarted and properly form a functional
> quorum.
> Here are some interesting log snippets. Nothing else of interest was seen in
> the logs during this time:
> zk3. This is where it becomes the leader after failing initially (as the
> leader). Notice the "round" is ahead of zk1 and zk2:
> {noformat}
> 2012-07-18 17:19:35,423 - INFO [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@663] - New election. My id = 3, Proposed zxid = 77309411648
> 2012-07-18 17:19:35,423 - INFO [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 77309411648 (n.zxid), 832 (n.round), LOOKING (n.state), 3 (n.sid), LOOKING (my state)
> 2012-07-18 17:19:35,424 - INFO [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 73014444480 (n.zxid), 831 (n.round), FOLLOWING (n.state), 2 (n.sid), LOOKING (my state)
> 2012-07-18 17:19:35,424 - INFO [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 73014444480 (n.zxid), 831 (n.round), FOLLOWING (n.state), 1 (n.sid), LOOKING (my state)
> 2012-07-18 17:19:35,424 - INFO [QuorumPeer:/0.0.0.0:2181:QuorumPeer@655] - LEADING
> {noformat}
> zk1, which won't come back. Notice that zk3 is reporting the round as 831,
> while zk2 reports the round as 832:
> {noformat}
> 2012-07-18 17:31:12,015 - INFO [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 1 (n.leader), 77309411648 (n.zxid), 1 (n.round), LOOKING (n.state), 1 (n.sid), LOOKING (my state)
> 2012-07-18 17:31:12,016 - INFO [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 73014444480 (n.zxid), 831 (n.round), LEADING (n.state), 3 (n.sid), LOOKING (my state)
> 2012-07-18 17:31:12,017 - INFO [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 77309411648 (n.zxid), 832 (n.round), FOLLOWING (n.state), 2 (n.sid), LOOKING (my state)
> 2012-07-18 17:31:15,219 - INFO [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@697] - Notification time out: 6400
> {noformat}
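
The round disagreement visible in the zk1 snippet can also be found mechanically. What follows is a minimal Python sketch for that, not part of ZooKeeper or of the attached patch. The helper names parse_notifications and round_disagreements are hypothetical, and the regular expression assumes the "Notification:" line layout seen in the snippets above. The sample input is the zk1 log quoted in this issue.

```python
import re
from collections import defaultdict

# Assumed layout of a FastLeaderElection "Notification:" entry,
# taken from the log snippets quoted in this issue.
NOTIFICATION = re.compile(
    r"Notification: (?P<leader>\d+) \(n\.leader\), "
    r"(?P<zxid>\d+) \(n\.zxid\), "
    r"(?P<round>\d+) \(n\.round\), "
    r"(?P<state>\w+) \(n\.state\), "
    r"(?P<sid>\d+) \(n\.sid\)"
)


def parse_notifications(log_text):
    """Return one dict per Notification entry found in log_text."""
    # Drop e-mail quote markers and undo line wrapping so that every
    # entry sits on a single logical line before matching.
    lines = [line.lstrip("> ").strip() for line in log_text.splitlines()]
    flat = " ".join(line for line in lines if line)
    result = []
    for match in NOTIFICATION.finditer(flat):
        fields = match.groupdict()
        result.append({
            "leader": int(fields["leader"]),
            "zxid": int(fields["zxid"]),
            "round": int(fields["round"]),
            "state": fields["state"],
            "sid": int(fields["sid"]),
        })
    return result


def round_disagreements(notifications):
    """Map a leader id to the rounds reported for it by peers that are
    not LOOKING, keeping only leaders with more than one distinct round."""
    rounds = defaultdict(set)
    for note in notifications:
        if note["state"] != "LOOKING":
            rounds[note["leader"]].add(note["round"])
    return {leader: sorted(seen) for leader, seen in rounds.items() if len(seen) > 1}


# The zk1 log from this issue (timeout line omitted).
ZK1_LOG = """\
2012-07-18 17:31:12,015 - INFO [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 1 (n.leader), 77309411648 (n.zxid), 1 (n.round), LOOKING (n.state), 1 (n.sid), LOOKING (my state)
2012-07-18 17:31:12,016 - INFO [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 73014444480 (n.zxid), 831 (n.round), LEADING (n.state), 3 (n.sid), LOOKING (my state)
2012-07-18 17:31:12,017 - INFO [WorkerReceiver Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 77309411648 (n.zxid), 832 (n.round), FOLLOWING (n.state), 2 (n.sid), LOOKING (my state)
"""

print(round_disagreements(parse_notifications(ZK1_LOG)))  # {3: [831, 832]}
```

A non-empty result means that peers which have already settled on the same leader report different rounds for it, which is the condition highlighted in the zk1 log.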