[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569824#comment-14569824
 ] 

Alexander Shraer commented on ZOOKEEPER-2189:
---------------------------------------------

Hi [~suda],

On second thought, although it's not going to solve the specific scenario you 
describe, it may still be a good idea to add some check(s). For example, if a 
server receives a configuration from another server (in 
FastLeaderElection.java) with the same configuration version, the 
configuration itself must be identical. The check should only apply to 
non-initial configs (see ZOOKEEPER-1783), so probably (rqv.getVersion() > 
0x100000000 and rqv.getVersion() == curQV.getVersion()). 
If you still think it's a good idea, would you like to propose a patch?
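
A rough sketch of the check described above, for illustration only. 
ConfigSketch, its fields, and conflicts() are hypothetical stand-ins, not 
ZooKeeper's actual QuorumVerifier API, and the role map is an assumed 
representation of a configuration. Only the version test itself is taken 
from the condition proposed above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a quorum configuration; not ZooKeeper's QuorumVerifier. */
final class ConfigSketch {
    // Versions above this value are treated as non-initial configs,
    // following the condition proposed in the comment.
    static final long INITIAL_CONFIG_BOUND = 0x100000000L;

    final long version;
    final Map<Long, String> roles; // server id -> "participant" or "observer" (assumed shape)

    ConfigSketch(long version, Map<Long, String> roles) {
        this.version = version;
        this.roles = new TreeMap<>(roles); // defensive copy
    }

    /**
     * Returns true when the received config (rqv) carries the same non-initial
     * version as the current config (curQV) but different contents.
     */
    static boolean conflicts(ConfigSketch rqv, ConfigSketch curQV) {
        return rqv.version > INITIAL_CONFIG_BOUND
                && rqv.version == curQV.version
                && !rqv.roles.equals(curQV.roles);
    }

    public static void main(String[] args) {
        Map<Long, String> allParticipants = new HashMap<>();
        allParticipants.put(1L, "participant");
        allParticipants.put(2L, "participant");
        allParticipants.put(3L, "participant");

        Map<Long, String> onlyThreeVotes = new HashMap<>();
        onlyThreeVotes.put(1L, "observer");
        onlyThreeVotes.put(2L, "observer");
        onlyThreeVotes.put(3L, "participant");

        long nonInitial = 0x200000000L; // arbitrary example version above the bound
        ConfigSketch cur = new ConfigSketch(nonInitial, allParticipants);
        ConfigSketch received = new ConfigSketch(nonInitial, onlyThreeVotes);

        // Same non-initial version, different contents: the check fires.
        System.out.println(conflicts(received, cur));
        // Initial configs (version 0 here) are deliberately skipped by the check.
        System.out.println(conflicts(new ConfigSketch(0L, onlyThreeVotes),
                                     new ConfigSketch(0L, allParticipants)));
    }
}
```

How a server should react to a detected conflict (log, reject the 
notification, shut down) is left open here.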

Regarding the config registry, I didn't have any specific system in mind, 
perhaps others can advise. [~phunt][~rgs]

Thanks,
Alex

> multiple leaders can be elected when configs conflict
> -----------------------------------------------------
>
>                 Key: ZOOKEEPER-2189
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2189
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection
>    Affects Versions: 3.5.0
>            Reporter: Akihiro Suda
>
> This sequence leads the ensemble to a split-brain state:
>  * Start server 1 (config=1:participant, 2:participant, 3:participant)
>  * Start server 2 (config=1:participant, 2:participant, 3:participant)
>  * 1 and 2 believe 2 is the leader
>  * Start server 3 (config=1:observer, 2:observer, 3:participant)
>  * 3 believes 3 is the leader, although 1 and 2 still believe 2 is the leader
> Such a split-brain ensemble is very unstable.
> Znodes can be lost easily:
>  * Create some znodes on 2
>  * Restart 1 and 2
>  * 1, 2 and 3 can think 3 is the leader
>  * znodes created on 2 are lost, as 1 and 2 sync with 3
> I consider this behavior a bug and think ZK should fail gracefully if a 
> participant is listed as an observer in the config.
> In the current implementation, ZK cannot detect such an invalid config, as 
> FastLeaderElection.sendNotification() sends notifications only to voting 
> members, so no message travels from the observers (1 and 2) to the new 
> voter (3).
> I think FastLeaderElection.sendNotification() should send notifications to 
> all the members and FastLeaderElection.Messenger.WorkerReceiver.run() should 
> verify acks.
> Any thoughts?
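
To illustrate the split-brain sequence quoted above, here is a minimal 
sketch assuming a simple majority quorum over each server's own view of the 
voting members. SplitBrainSketch and hasQuorum() are hypothetical names and 
do not correspond to ZooKeeper classes. The sketch only shows that each 
side can reach a quorum under its own config.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration; not ZooKeeper code. */
final class SplitBrainSketch {
    /** Assumed majority rule: strictly more than half of the voters must agree. */
    static boolean hasQuorum(Set<Long> voters, Set<Long> supporters) {
        Set<Long> counted = new HashSet<>(supporters);
        counted.retainAll(voters); // only voting members count toward the quorum
        return counted.size() > voters.size() / 2;
    }

    public static void main(String[] args) {
        // Servers 1 and 2 were started with all three servers as participants.
        Set<Long> votersSeenBy1And2 = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        // Server 3 was started with itself as the only participant.
        Set<Long> votersSeenBy3 = new HashSet<>(Arrays.asList(3L));

        // Servers 1 and 2 back server 2; server 3 backs itself.
        boolean twoElected = hasQuorum(votersSeenBy1And2, new HashSet<>(Arrays.asList(1L, 2L)));
        boolean threeElected = hasQuorum(votersSeenBy3, new HashSet<>(Arrays.asList(3L)));

        System.out.println("2 has a quorum under the config of 1 and 2: " + twoElected);
        System.out.println("3 has a quorum under the config of 3: " + threeElected);
    }
}
```

Both checks succeed, which matches the reported state where 1 and 2 follow 
2 while 3 follows itself.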



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
