Tanakorn Leesatapornwongsa created ZOOKEEPER-1912:
-----------------------------------------------------
Summary: Leader election lets 2 leaders happen
Key: ZOOKEEPER-1912
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1912
Project: ZooKeeper
Issue Type: Bug
Components: leaderElection
Affects Versions: 3.4.6
Environment: Ubuntu 12.04, OpenJDK 1.6
Reporter: Tanakorn Leesatapornwongsa
Priority: Minor
In 3-node cluster, when there are 2 nodes die and reboot during leader
election, it might lead to the case that there are 2 leaders happen in the
system. Eventually, a leader that does not has follower supports and quit being
leader, but it makes us lose some availability.
I am building a tools that can reorder messages and disk write, and also inject
node crash to the system and found this bug.
These are the step of events that lead to 2 leaders at the end. My zookeeper
nodes have id = 0,1,2
packetsend from=0 to=1 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=0 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=1 to=0 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=1 to=2 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=1 to=0 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=0 to=1 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=1 to=2 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=0 to=2 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
diskwrite nodeId=0 write=currentEpoch
nodecrash id=0
nodecrash id=1
nodestart id=0
nodestart id=1
diskwrite nodeId=2 write=currentEpoch
packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=0 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=0 to=1 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=0 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=1 to=2 state=0 leader=1 zxid=0 electionEpoch=1 peerEpoch=0
packetsend from=2 to=0 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=2 leader=2 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=0 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=1 to=2 state=0 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=0 to=2 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=1 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=1 to=2 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=0 to=2 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=0 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=0 to=1 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
packetsend from=2 to=1 state=0 leader=2 zxid=0 electionEpoch=2 peerEpoch=1
packetsend from=0 to=2 state=2 leader=0 zxid=0 electionEpoch=1 peerEpoch=1
diskwrite nodeId=2 write=currentEpoch
diskwrite nodeId=1 write=currentEpoch
--
This message was sent by Atlassian JIRA
(v6.2#6252)