[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Germán Blanco updated ZOOKEEPER-1777:
-------------------------------------

    Attachment: ZOOKEEPER-1777.tar.gz

Thanks a lot, Flavio and Thawan, for looking into this!
I thought A does not get a TRUNC because B and C are already at a zxid that is 
higher than a9, which is the highest zxid that A has seen.
As I understand it, a TRUNC is only sent if the leader has a lower zxid than 
the incoming learner.
The logs and data dir for this case are now attached.
This is the resulting data in the wrong follower:
[3, 2, 1, 6, zookeeper, 5, 5bis, 4]
And this is the resulting data in the leader and the other follower:
[3, 2, 1, 4bis, 6, zookeeper, 5bis]
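
To make the difference between the two listings explicit, here is a small 
standalone sketch in plain Java. It is not ZooKeeper code; the class and method 
names are made up for illustration, and the only inputs are the two listings 
quoted above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of ZooKeeper: compares two child listings.
class ChildDiff {
    // Names present in `a` but absent from `b`, sorted for stable output.
    static Set<String> onlyIn(List<String> a, List<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        // Listings copied from the comment above.
        List<String> wrongFollower =
            Arrays.asList("3", "2", "1", "6", "zookeeper", "5", "5bis", "4");
        List<String> leader =
            Arrays.asList("3", "2", "1", "4bis", "6", "zookeeper", "5bis");

        System.out.println("only in wrong follower: " + onlyIn(wrongFollower, leader));
        System.out.println("only in leader:         " + onlyIn(leader, wrongFollower));
    }
}
```

Run on these two listings, it reports 4 and 5 as present only in the wrong 
follower and 4bis as present only in the leader and the other follower.
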
I am not saying that this is an error in the protocol. I am only saying that I 
see it as a problem, and that a small modification of the protocol is one 
possible solution. Another would be to add an option to force SNAP 
synchronization, and there are very likely more.

> Missing ephemeral nodes in one of the members of the ensemble
> -------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1777
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.4.5
>         Environment: Linux, Java 1.7
>            Reporter: Germán Blanco
>            Assignee: Germán Blanco
>            Priority: Blocker
>             Fix For: 3.4.6, 3.5.0
>
>         Attachments: snaps.tar, ZOOKEEPER-1777.tar.gz
>
>
> In a 3-server ensemble, one of the followers doesn't see some of the 
> ephemeral nodes that are present in the leader and the other follower. 
> The 8 nodes missing from "the follower that is not ok" were created at the 
> end of epoch 1; the ensemble is now running in epoch 2.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
