[
https://issues.apache.org/jira/browse/ZOOKEEPER-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569824#comment-14569824
]
Alexander Shraer commented on ZOOKEEPER-2189:
---------------------------------------------
Hi [~suda],
On second thought, although it's not going to solve the specific scenario you
describe, it may still be a good idea to add some check(s). For example, if a
server receives a configuration from another server (in
FastLeaderElection.java) with the same configuration version, the
configuration itself must be identical. The check should apply only to
non-initial configs (see ZOOKEEPER-1783), so the condition would probably be
(rqv.getVersion() > 0x100000000 and rqv.getVersion() == curQV.getVersion()).
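A minimal sketch of that check follows, under stated assumptions. ConfigSnapshot and ConfigConsistencyCheck are hypothetical stand-ins, not the real QuorumVerifier or FastLeaderElection code, and comparing a canonical string form of the member list is an assumption about how "identical" would be tested. Note that the threshold needs a long literal in Java (0x100000000L), since the value does not fit in an int.

```java
import java.util.Objects;

/** Hypothetical stand-in for a quorum configuration; not the real QuorumVerifier API. */
final class ConfigSnapshot {
    final long version;
    final String members; // assumed canonical text form of the member list

    ConfigSnapshot(long version, String members) {
        this.version = version;
        this.members = members;
    }
}

final class ConfigConsistencyCheck {
    // Threshold from the comment above: only versions greater than this
    // are considered non-initial configs (see ZOOKEEPER-1783).
    static final long INITIAL_CONFIG_THRESHOLD = 0x100000000L;

    /**
     * Returns true when the received config has the same non-initial
     * version as the current one but different contents.
     */
    static boolean conflicts(ConfigSnapshot received, ConfigSnapshot current) {
        boolean nonInitial = received.version > INITIAL_CONFIG_THRESHOLD;
        boolean sameVersion = received.version == current.version;
        return nonInitial
                && sameVersion
                && !Objects.equals(received.members, current.members);
    }
}
```

What the server should do when conflicts(...) returns true (log an error, ignore the notification, or shut down) is left open here.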
If you still think it's a good idea, would you like to propose a patch?
Regarding the config registry, I didn't have any specific system in mind,
perhaps others can advise. [~phunt][~rgs]
Thanks,
Alex
> multiple leaders can be elected when configs conflict
> -----------------------------------------------------
>
> Key: ZOOKEEPER-2189
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2189
> Project: ZooKeeper
> Issue Type: Bug
> Components: leaderElection
> Affects Versions: 3.5.0
> Reporter: Akihiro Suda
>
> This sequence leads the ensemble to a split-brain state:
> * Start server 1 (config=1:participant, 2:participant, 3:participant)
> * Start server 2 (config=1:participant, 2:participant, 3:participant)
> * 1 and 2 believe 2 is the leader
> * Start server 3 (config=1:observer, 2:observer, 3:participant)
> * 3 believes 3 is the leader, although 1 and 2 still believe 2 is the leader
> Such a split-brain ensemble is very unstable.
> Znodes can be lost easily:
> * Create some znodes on 2
> * Restart 1 and 2
> * 1, 2 and 3 can think 3 is the leader
> * znodes created on 2 are lost, as 1 and 2 sync with 3
> I consider this behavior a bug and think ZK should fail gracefully if a
> participant is listed as an observer in the config.
> In the current implementation, ZK cannot detect such an invalid config, as
> FastLeaderElection.sendNotification() sends notifications only to voting
> members, and hence there is no message from the observers (1 and 2) to the
> new voter (3).
> I think FastLeaderElection.sendNotification() should send notifications to
> all the members and FastLeaderElection.Messenger.WorkerReceiver.run() should
> verify acks.
> Any thoughts?