[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039364#comment-13039364
 ] 

Vishal Kathuria commented on ZOOKEEPER-1075:
--------------------------------------------

I think the issue is in the following code in FastLeaderElection.java:

    /**
     * Before joining an established ensemble, verify that
     * a majority are following the same leader.
     */
    outofelection.put(n.sid, new Vote(n.leader, n.zxid,
            n.epoch, n.state));
    if (termPredicate(outofelection, new Vote(n.leader,
            n.zxid, n.epoch, n.state))
            && checkLeader(outofelection, n.leader, n.epoch)) {

In the case above, outofelection contains only one entry, which does not
constitute a majority. What we really need to check is whether
outofelection.size() + 1 (counting this server) forms a majority, because once
this server accepts the leader, the leader will have a majority of followers.
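
To make the arithmetic concrete, here is a minimal sketch of the two checks
for the 3-server case. It is not the ZooKeeper code: QuorumSketch and all of
its methods are hypothetical names, the map stands in for outofelection
(server id -> leader that server reports), and a simple "more than half"
rule is assumed for the majority test.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these names exist in ZooKeeper. */
final class QuorumSketch {

    /** Assumed rule: a strict majority is more than half of the ensemble. */
    static boolean isMajority(int count, int ensembleSize) {
        return count > ensembleSize / 2;
    }

    /** Counts the servers in the map that report the given leader. */
    static int supporters(Map<Long, Long> outOfElection, long leader) {
        int count = 0;
        for (long reportedLeader : outOfElection.values()) {
            if (reportedLeader == leader) {
                count++;
            }
        }
        return count;
    }

    /** Check as described above: only the received notifications count. */
    static boolean currentCheck(Map<Long, Long> outOfElection, long leader,
                                int ensembleSize) {
        return isMajority(supporters(outOfElection, leader), ensembleSize);
    }

    /**
     * Proposed check: also count the joining server itself, since it is
     * about to follow the leader. The extra vote is added only when the
     * joining server is not already present in the map.
     */
    static boolean proposedCheck(Map<Long, Long> outOfElection, long leader,
                                 long selfId, int ensembleSize) {
        int count = supporters(outOfElection, leader);
        if (!outOfElection.containsKey(selfId)) {
            count++;
        }
        return isMajority(count, ensembleSize);
    }

    public static void main(String[] args) {
        // Server 2 restarts and hears only from server 1, which says it leads.
        Map<Long, Long> outOfElection = new HashMap<>();
        outOfElection.put(1L, 1L);

        System.out.println(currentCheck(outOfElection, 1L, 3));      // false
        System.out.println(proposedCheck(outOfElection, 1L, 2L, 3)); // true
    }
}
```

With one notification in a 3-server ensemble, the first check sees 1 of 3
and fails, while the second sees 2 of 3 and passes.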

> Zookeeper Server cannot join an existing ensemble if the existing ensemble 
> doesn't already have a quorum
> --------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1075
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1075
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection
>    Affects Versions: 3.3.2
>         Environment: Windows 7
>            Reporter: Vishal Kathuria
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Here is the sequence of steps that reproduces the problem.
> On a 3-server ensemble,
> 1. Bring up two servers (say 1 and 2). Let's say 1 is leading.
> 2. Bring down 2
> 3. Bring up 2. 
> 4. 2 gets a notification from 1 that it is leading but 2 doesn't accept it as 
> a leader since it cannot find one other node that thinks 1 is the leader.
> So the ensemble gets stuck, with 2 not following 1. If 3 comes up at this 
> point, one of 2 and 3 will become a leader while 1 keeps thinking it is the 
> leader.
> I am working on a patch to fix this issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
