[
https://issues.apache.org/jira/browse/ZOOKEEPER-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424286#comment-13424286
]
Henry Robinson commented on ZOOKEEPER-1514:
-------------------------------------------
Flavio - this looks fine. The point I am trying to make about this bit of code:
{code}
if(listener != null){
listener.start();
} else {
LOG.error("Null listener when initializing cnx manager");
Assert.fail("Failed to create cnx manager");
}
{code}
is that there's no need for the null check, since if {{listener}} is null,
there'll be an NPE thrown which will fail the test anyhow. Plus, looking at
{{QuorumCnxManager.java:153}}, I can't see any way in which {{listener}} can be
null, because it's unambiguously assigned to a {{new Listener()}}. Is there a
case that I'm missing?
I know this doesn't really affect the functionality of the patch, but if these
checks aren't necessary, they will confuse future readers of the test.
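To make the point concrete, here is a minimal sketch of what the test body reduces to without the check. The class names below ({{ListenerStartSketch}}, {{CnxManager}}, {{Listener}}) are hypothetical stand-ins, not the real {{QuorumCnxManager}} code; the only assumption carried over is that the field is assigned unconditionally in the constructor.

```java
public class ListenerStartSketch {
    /** Hypothetical stand-in for the listener thread; not the real ZooKeeper class. */
    static class Listener {
        private boolean started = false;

        void start() {
            started = true;
        }

        boolean isStarted() {
            return started;
        }
    }

    /** Hypothetical stand-in for the connection manager. */
    static class CnxManager {
        // Assigned unconditionally in the constructor, so callers never see null.
        final Listener listener;

        CnxManager() {
            this.listener = new Listener();
        }
    }

    /** The test body with the null check removed. */
    static boolean startListener(CnxManager manager) {
        // If listener were somehow null, the NullPointerException thrown here
        // would already fail a JUnit test; no explicit Assert.fail is needed.
        manager.listener.start();
        return manager.listener.isStarted();
    }

    public static void main(String[] args) {
        System.out.println(startListener(new CnxManager()));
    }
}
```

The direct call keeps the failure mode (an exception fails the test) while removing a branch that cannot be reached.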
> FastLeaderElection - leader ignores the round information when joining a
> quorum
> -------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-1514
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1514
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.3.4
> Reporter: Patrick Hunt
> Assignee: Flavio Junqueira
> Priority: Critical
> Fix For: 3.4.4, 3.5.0, 3.3.7
>
> Attachments: ZOOKEEPER-1514.patch, ZOOKEEPER-1514.patch,
> ZOOKEEPER-1514.patch
>
>
> In the following case we have a 3 server ensemble.
> Initially all is well, zk3 is the leader.
> However zk3 fails, restarts, and rejoins the quorum as the new leader (was
> the old leader, still the leader after re-election)
> The existing two followers, zk1 and zk2 rejoin the new quorum again as
> followers of zk3.
> zk1 then fails, its data directory is deleted (so it has no state whatsoever),
> and it is restarted. However zk1 can never rejoin the quorum (even after an hour).
> During this time zk2 and zk3 are serving properly.
> Later all three servers are restarted and properly form a functional
> quorum.
> Here are some interesting log snippets. Nothing else of interest was seen in
> the logs during this time:
> zk3. This is where it becomes the leader after failing initially (as the
> leader). Notice the "round" is ahead of zk1 and zk2:
> {noformat}
> 2012-07-18 17:19:35,423 - INFO
> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@663] - New election. My id = 3,
> Proposed zxid = 77309411648
> 2012-07-18 17:19:35,423 - INFO [WorkerReceiver
> Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 77309411648
> (n.zxid), 832 (n.round), LOOKING (n.state), 3 (n.sid), LOOKING (my state)
> 2012-07-18 17:19:35,424 - INFO [WorkerReceiver
> Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 73014444480
> (n.zxid), 831 (n.round), FOLLOWING (n.state), 2 (n.sid), LOOKING (my state)
> 2012-07-18 17:19:35,424 - INFO [WorkerReceiver
> Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 73014444480
> (n.zxid), 831 (n.round), FOLLOWING (n.state), 1 (n.sid), LOOKING (my state)
> 2012-07-18 17:19:35,424 - INFO [QuorumPeer:/0.0.0.0:2181:QuorumPeer@655] -
> LEADING
> {noformat}
> zk1, which won't come back. Notice that zk3 is reporting the round as 831,
> while zk2 thinks that the round is 832:
> {noformat}
> 2012-07-18 17:31:12,015 - INFO [WorkerReceiver
> Thread:FastLeaderElection@496] - Notification: 1 (n.leader), 77309411648
> (n.zxid), 1 (n.round), LOOKING (n.state), 1 (n.sid), LOOKING (my state)
> 2012-07-18 17:31:12,016 - INFO [WorkerReceiver
> Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 73014444480
> (n.zxid), 831 (n.round), LEADING (n.state), 3 (n.sid), LOOKING (my state)
> 2012-07-18 17:31:12,017 - INFO [WorkerReceiver
> Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 77309411648
> (n.zxid), 832 (n.round), FOLLOWING (n.state), 2 (n.sid), LOOKING (my state)
> 2012-07-18 17:31:15,219 - INFO
> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@697] - Notification time out:
> 6400
> {noformat}
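One possible reading of the zk1 log above, sketched as a simplified model. The assumption here, which is not confirmed by anything in this message, is that a joining peer follows an established leader only once a majority of the ensemble reports an identical (leader, zxid, round) tuple. The class and method names ({{RoundMismatchSketch}}, {{Vote}}, {{hasQuorum}}, {{fromLog}}) are hypothetical and are not the {{FastLeaderElection}} API; the numeric values are copied from the notifications zk1 received.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class RoundMismatchSketch {
    /** Simplified vote: proposed leader, zxid, and election round. */
    static final class Vote {
        final long leader;
        final long zxid;
        final long round;

        Vote(long leader, long zxid, long round) {
            this.leader = leader;
            this.zxid = zxid;
            this.round = round;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Vote)) {
                return false;
            }
            Vote v = (Vote) o;
            return leader == v.leader && zxid == v.zxid && round == v.round;
        }

        @Override
        public int hashCode() {
            return Objects.hash(leader, zxid, round);
        }
    }

    /** True when more than half of the ensemble reported exactly this vote. */
    static boolean hasQuorum(Map<Long, Vote> votesBySid, Vote candidate, int ensembleSize) {
        long matching = votesBySid.values().stream().filter(candidate::equals).count();
        return matching > ensembleSize / 2;
    }

    /** The notifications zk1 received from the two serving peers, keyed by sender id. */
    static Map<Long, Vote> fromLog() {
        Map<Long, Vote> votes = new HashMap<>();
        votes.put(3L, new Vote(3L, 73014444480L, 831L)); // from zk3 (LEADING)
        votes.put(2L, new Vote(3L, 77309411648L, 832L)); // from zk2 (FOLLOWING)
        return votes;
    }

    public static void main(String[] args) {
        Map<Long, Vote> votes = fromLog();
        System.out.println("quorum for zk3's tuple: " + hasQuorum(votes, votes.get(3L), 3));
        System.out.println("quorum for zk2's tuple: " + hasQuorum(votes, votes.get(2L), 3));
    }
}
```

Under that assumption, both peers name server 3 as leader, but their tuples differ in zxid and round, so neither tuple reaches a majority of two in a three-server ensemble.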