[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804129#comment-16804129
 ] 

Fangmin Lv commented on ZOOKEEPER-3336:
---------------------------------------

[~NIWIS] the FastLeaderElection currently in use does allow two leaders to be 
elected, but only one of them can be activated by a majority, so it won't 
cause split brain.

Here is a simple scenario that could produce multiple leaders:

Let's say we have a 5-node ensemble:
 # nodes 1 and 2 are stopped at the beginning
 # nodes 3, 4 and 5 start a new round of election; 5 is elected leader and 
goes on to wait for the epoch from its followers
 # node 3 is stopped before it starts following 5, so only 4 is following 5; 
5 is waiting for another node to join before it activates its leadership and 
starts broadcasting
 # then nodes 1, 2 and 3 restart and start a new round of leader election
 # nodes 4 and 5 report that 5 is the leader, but 1, 2 and 3 will only follow 
5 if a majority confirms it, which is not the case
 # nodes 1, 2 and 3 vote for 3; 3 gets a majority so it starts leading, while 
5 is still waiting for another peer to join until it times out
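
The quorum arithmetic behind this scenario can be sketched as below. This is 
only an illustrative model, not ZooKeeper code: the has_quorum helper and the 
supporter sets are hypothetical names, and a simple strict-majority quorum is 
assumed.

```python
from itertools import combinations

# Hypothetical model of the 5-node scenario above; not ZooKeeper code.
ENSEMBLE_SIZE = 5


def has_quorum(supporters, ensemble_size=ENSEMBLE_SIZE):
    """Return True if the supporters form a strict majority of the ensemble."""
    return len(set(supporters)) > ensemble_size // 2


# Node 5 won the first election, but only node 4 (plus itself) follows it.
supporters_of_5 = {4, 5}
# Nodes 1, 2 and 3 voted for 3 in the later election.
supporters_of_3 = {1, 2, 3}

leader_5_active = has_quorum(supporters_of_5)
leader_3_active = has_quorum(supporters_of_3)

# Any two majorities of the same ensemble share at least one node, so two
# leaders can never both be backed by a majority at the same time.
nodes = range(1, ENSEMBLE_SIZE + 1)
quorums = [
    set(c)
    for size in range(ENSEMBLE_SIZE // 2 + 1, ENSEMBLE_SIZE + 1)
    for c in combinations(nodes, size)
]
all_intersect = all(a & b for a, b in combinations(quorums, 2))

print("node 5 can activate leadership:", leader_5_active)
print("node 3 can activate leadership:", leader_3_active)
print("every pair of quorums overlaps:", all_intersect)
```

Under these assumptions node 5 stays short of a majority with 2 of 5 
supporters, while node 3 reaches it with 3 of 5, which matches the point that 
only one of the two elected leaders can be activated.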

> Leader election terminated, two leaders or not following leader or not having 
> state
> -----------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3336
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3336
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection
>    Affects Versions: 3.4.13
>         Environment: Debian, Java 8
>            Reporter: Simin Oraee
>            Priority: Major
>         Attachments: conf, zookeeper.log
>
>
> I am working on a testing tool for distributed systems. I tested Zookeeper, 
> enforcing different possible orderings of events. I encountered some 
> inconsistencies in the election of the leader. Here are the logs of 3 
> completed executions.
> I am wondering if these behaviors are expected or not.
> 1) More than one node consider themselves leaders:
> NodeCrashEvent\{id=1, nodeId=0}
> NodeStartEvent\{id=7, nodeId=0}
> MessageEvent\{id=8, predecessors=[7], from=0, to=0, leader=0, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=9, predecessors=[8, 7], from=0, to=1, leader=0, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=10, predecessors=[9, 7], from=0, to=2, leader=0, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=5, predecessors=[], from=1, to=0, leader=1, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=12, predecessors=[5, 10, 7], from=0, to=0, leader=1, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=13, predecessors=[12, 5, 7], from=0, to=1, leader=1, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=14, predecessors=[5, 13, 7], from=0, to=2, leader=1, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=11, predecessors=[5], from=1, to=1, leader=1, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=15, predecessors=[11], from=1, to=2, leader=1, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=6, predecessors=[], from=2, to=0, leader=2, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> Node 1 state: LEADING
> Node 1 final vote: Vote\{leader=1, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=17, predecessors=[6, 14, 7], from=0, to=0, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=18, predecessors=[17, 6, 7], from=0, to=1, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=19, predecessors=[18, 6, 7], from=0, to=2, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=20, predecessors=[18], from=1, to=0, leader=1, 
> state=LEADING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=16, predecessors=[6], from=2, to=1, leader=2, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=22, predecessors=[16, 20], from=1, to=2, leader=1, 
> state=LEADING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=21, predecessors=[16], from=2, to=2, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> Node 0 state: FOLLOWING
> Node 0 final vote: Vote\{leader=2, zxid=0, electionEpoch=1, peerEpoch=0}
> Node 2 state: LEADING
> Node 2 final vote: Vote\{leader=2, zxid=0, electionEpoch=1, peerEpoch=0}
> 2) There are some nodes that follow nodes other than the leaders:
> NodeCrashEvent\{id=1, nodeId=0}
> NodeStartEvent\{id=7, nodeId=0}
> MessageEvent\{id=8, predecessors=[7], from=0, to=0, leader=0, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=9, predecessors=[8, 7], from=0, to=1, leader=0, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=10, predecessors=[9, 7], from=0, to=2, leader=0, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=5, predecessors=[], from=1, to=0, leader=1, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=12, predecessors=[5, 10, 7], from=0, to=0, leader=1, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=13, predecessors=[12, 5, 7], from=0, to=1, leader=1, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=14, predecessors=[5, 13, 7], from=0, to=2, leader=1, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> Node 0 state: FOLLOWING
> Node 0 final vote: Vote\{leader=1, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=11, predecessors=[5], from=1, to=1, leader=1, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=15, predecessors=[11], from=1, to=2, leader=1, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=6, predecessors=[], from=2, to=0, leader=2, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=17, predecessors=[6, 7], from=0, to=2, leader=1, 
> state=FOLLOWING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=16, predecessors=[6], from=2, to=1, leader=2, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=19, predecessors=[16, 15], from=1, to=0, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=20, predecessors=[16, 19], from=1, to=1, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=22, predecessors=[16, 20], from=1, to=2, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=21, predecessors=[17, 19, 7], from=0, to=1, leader=1, 
> state=FOLLOWING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=18, predecessors=[16], from=2, to=2, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> Node 1 state: FOLLOWING
> Node 1 final vote: Vote\{leader=2, zxid=0, electionEpoch=1, peerEpoch=0}
> Node 2 state: LEADING
> Node 2 final vote: Vote\{leader=2, zxid=0, electionEpoch=1, peerEpoch=0}
> 3) There are some nodes that are neither following nor leading:
> NodeCrashEvent\{id=3, nodeId=2}
> NodeStartEvent\{id=7, nodeId=2}
> MessageEvent\{id=8, predecessors=[7], from=2, to=0, leader=2, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=9, predecessors=[8, 7], from=2, to=1, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=10, predecessors=[9, 7], from=2, to=2, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=5, predecessors=[], from=1, to=0, leader=1, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=11, predecessors=[5], from=1, to=1, leader=1, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=12, predecessors=[11], from=1, to=2, leader=1, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=13, predecessors=[12, 9], from=1, to=0, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=14, predecessors=[9, 13], from=1, to=1, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=15, predecessors=[9, 14], from=1, to=2, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=4, predecessors=[], from=0, to=0, leader=0, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=16, predecessors=[4], from=0, to=1, leader=0, state=LOOKING, 
> zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=17, predecessors=[16], from=0, to=2, leader=0, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=18, predecessors=[8, 17], from=0, to=0, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=19, predecessors=[8, 18], from=0, to=1, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> Node 2 state: LEADING
> Node 2 final vote: Vote\{leader=2, zxid=0, electionEpoch=1, peerEpoch=0}
> Node 1 state: FOLLOWING
> Node 1 final vote: Vote\{leader=2, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=20, predecessors=[8, 19], from=0, to=2, leader=2, 
> state=LOOKING, zxid=0, electionEpoch=1, peerEpoch=0}
> MessageEvent\{id=21, predecessors=[20, 7], from=2, to=0, leader=2, 
> state=LEADING, zxid=0, electionEpoch=1, peerEpoch=1}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
