[jira] [Commented] (ZOOKEEPER-1514) FastLeaderElection - leader ignores the round information when joining a quorum

Henry Robinson (JIRA) Fri, 27 Jul 2012 23:43:39 -0700

    [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424286#comment-13424286
 ]


Henry Robinson commented on ZOOKEEPER-1514:
-------------------------------------------

Flavio - this looks fine. The point I am trying to make about this bit of code:

{code}
  if(listener != null){
    listener.start();
  } else {
    LOG.error("Null listener when initializing cnx manager");
    Assert.fail("Failed to create cnx manager");
  }
{code}

is that there's no need for the null check, since if {{listener}} is null, 
there'll be an NPE thrown which will fail the test anyhow. Plus, looking at 
{{QuorumCnxManager.java:153}}, I can't see any way in which {{listener}} can be 
null, because it's unambiguously assigned to a {{new Listener()}}. Is there a 
case that I'm missing?

I know this doesn't really affect the functionality of the patch, but if these 
checks aren't necessary, they will confuse future readers of the test.
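
To illustrate the argument, here is a minimal, self-contained sketch. The 
names {{NullCheckSketch}}, {{CnxManager}} and {{Listener}} are hypothetical 
stand-ins and not the actual ZooKeeper classes or the test in the patch. It 
assumes, as described above, that the listener is assigned unconditionally in 
the constructor, and it shows that calling {{start()}} on a null reference 
throws a {{NullPointerException}} on its own, which would already fail a test:

{code}
public class NullCheckSketch {
    // Hypothetical stand-in for the listener; not the real ZooKeeper class.
    static class Listener {
        boolean started = false;
        void start() { started = true; }
    }

    // Hypothetical stand-in for the cnx manager. In this sketch the listener
    // is assigned unconditionally in the constructor, so it is never null.
    static class CnxManager {
        final Listener listener;
        CnxManager() { this.listener = new Listener(); }
    }

    // Returns true if calling start() on the given reference throws an NPE.
    static boolean startThrowsNpe(Listener l) {
        try {
            l.start();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        CnxManager cnxManager = new CnxManager();
        // No null check: a null listener would throw here and fail the caller.
        cnxManager.listener.start();
        System.out.println("started = " + cnxManager.listener.started);
        System.out.println("null reference throws NPE = " + startThrowsNpe(null));
    }
}
{code}

Under these assumptions the test body can be reduced to the single 
{{listener.start()}} call; if the field were ever null, the resulting NPE 
would surface as a test error without the extra branch.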
                
> FastLeaderElection - leader ignores the round information when joining a 
> quorum
> -------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1514
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1514
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.4
>            Reporter: Patrick Hunt
>            Assignee: Flavio Junqueira
>            Priority: Critical
>             Fix For: 3.4.4, 3.5.0, 3.3.7
>
>         Attachments: ZOOKEEPER-1514.patch, ZOOKEEPER-1514.patch, 
> ZOOKEEPER-1514.patch
>
>
> In the following case we have a 3 server ensemble.
> Initially all is well, zk3 is the leader.
> However zk3 fails, restarts, and rejoins the quorum as the new leader (was 
> the old leader, still the leader after re-election)
> The existing two followers, zk1 and zk2, rejoin the new quorum as 
> followers of zk3.
> zk1 then fails, its data directory is deleted (so it has no state 
> whatsoever), and it is restarted. However zk1 can never rejoin the quorum 
> (even after an hour). During this time zk2 and zk3 are serving properly.
> Later all three servers are restarted and properly form a functional 
> quorum.
> Here are some interesting log snippets. Nothing else of interest was seen in 
> the logs during this time:
> zk3. This is where it becomes the leader after failing initially (as the 
> leader). Notice the "round" is ahead of zk1 and zk2:
> {noformat}
> 2012-07-18 17:19:35,423 - INFO  
> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@663] - New election. My id =  3, 
> Proposed zxid = 77309411648
> 2012-07-18 17:19:35,423 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 77309411648 
> (n.zxid), 832 (n.round), LOOKING (n.state), 3 (n.sid), LOOKING (my state)
> 2012-07-18 17:19:35,424 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 73014444480 
> (n.zxid), 831 (n.round), FOLLOWING (n.state), 2 (n.sid), LOOKING (my state)
> 2012-07-18 17:19:35,424 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 73014444480 
> (n.zxid), 831 (n.round), FOLLOWING (n.state), 1 (n.sid), LOOKING (my state)
> 2012-07-18 17:19:35,424 - INFO  [QuorumPeer:/0.0.0.0:2181:QuorumPeer@655] - 
> LEADING
> {noformat}
> zk1, which won't come back. Notice that zk3 is reporting the round as 831, 
> while zk2 thinks that the round is 832:
> {noformat}
> 2012-07-18 17:31:12,015 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 1 (n.leader), 77309411648 
> (n.zxid), 1 (n.round), LOOKING (n.state), 1 (n.sid), LOOKING (my state)
> 2012-07-18 17:31:12,016 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 73014444480 
> (n.zxid), 831 (n.round), LEADING (n.state), 3 (n.sid), LOOKING (my state)
> 2012-07-18 17:31:12,017 - INFO  [WorkerReceiver 
> Thread:FastLeaderElection@496] - Notification: 3 (n.leader), 77309411648 
> (n.zxid), 832 (n.round), FOLLOWING (n.state), 2 (n.sid), LOOKING (my state)
> 2012-07-18 17:31:15,219 - INFO  
> [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@697] - Notification time out: 
> 6400
> {noformat}

