Tanakorn Leesatapornwongsa created ZOOKEEPER-1912:
-----------------------------------------------------

             Summary: Leader election lets 2 leaders happen
                 Key: ZOOKEEPER-1912
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1912
             Project: ZooKeeper
          Issue Type: Bug
          Components: leaderElection
    Affects Versions: 3.4.6
         Environment: Ubuntu 12.04, OpenJDK 1.6
            Reporter: Tanakorn Leesatapornwongsa
            Priority: Minor


In a 3-node cluster, when 2 nodes crash and restart during leader election, 
the system can end up with 2 leaders at the same time. Eventually, the leader 
that has no follower support quits being leader, but in the meantime we lose 
some availability.

I am building a tool that can reorder messages and disk writes and inject 
node crashes into the system, and it found this bug.
These are the events, in order, that lead to 2 leaders at the end. My ZooKeeper 
nodes have id = 0, 1, 2.

packetsend from=0 to=1 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=0 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=1 to=0 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=1 to=2 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=1 to=0 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=0 to=1 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=1 to=2 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=0 to=2 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
diskwrite nodeId=0 write=currentEpoch
nodecrash id=0
nodecrash id=1
nodestart id=0
nodestart id=1
diskwrite nodeId=2 write=currentEpoch
packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=0 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=0 to=1 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=0 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=1 to=2 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=2 to=0 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=0 to=2 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=1 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=1 to=2 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=0 to=2 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=0 to=2 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
diskwrite nodeId=2 write=currentEpoch
diskwrite nodeId=1 write=currentEpoch
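
The trace above can be checked mechanically. The following is a minimal sketch, 
not part of the original report, that scans packetsend lines for nodes that 
advertise themselves as leader. It assumes that state=2 means LEADING and 
state=0 means LOOKING, which the report does not state explicitly. The function 
name self_declared_leaders is hypothetical, and the sample is a four-line 
excerpt of the trace.

```python
import re

# Assumption: the numeric state field uses 0 = LOOKING, 2 = LEADING.
# The report does not spell this mapping out.
LEADING = 2

PACKET = re.compile(r"packetsend from=(\d+) to=(\d+) state=(\d+) leader=(\d+)")


def self_declared_leaders(trace):
    """Return ids of nodes that voted for themselves while in the LEADING state."""
    leaders = set()
    for line in trace.splitlines():
        m = PACKET.match(line.strip())
        if not m:
            continue  # skip diskwrite / nodecrash / nodestart events
        sender, _to, state, leader = map(int, m.groups())
        if state == LEADING and leader == sender:
            leaders.add(sender)
    return leaders


# Excerpt copied from the trace in this report.
sample = """\
packetsend from=2 to=0 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
diskwrite nodeId=2 write=currentEpoch
"""
print(sorted(self_declared_leaders(sample)))
```

Under that assumption, more than one id in the result means that more than one 
node considered itself leader at some point in the trace.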



--
This message was sent by Atlassian JIRA
(v6.2#6252)
