[
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784814#comment-13784814
]
Thawan Kooburat commented on ZOOKEEPER-1777:
--------------------------------------------
Similar to what Flavio already said, here is what I see.
Between steps 4 and 5, you actually lose a majority of the machines at once, so
the quorum moves forward without the committed txns from (1,7c) to (1,a9).
At step 6, A should get a TRUNC to (1,7b) and then start getting a DIFF with the
txns from (2,1) to (2,4). If A ever produced a snapshot after (1,7b), A won't be
able to process the TRUNC correctly; it will crash and never join a quorum.
If this is not the behavior that you observe, it is a bug in the implementation,
not in the protocol.
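
To make the expected TRUNC + DIFF sequence concrete, here is a minimal sketch in
Java. It is not the ZooKeeper implementation: the class SyncSketch and the
helpers plan() and truncIsSafe() are hypothetical names made up for this
illustration, and the leader log is modelled as a plain sorted list of zxids.
The only fact taken from ZooKeeper itself is that a zxid is a 64-bit value with
the epoch in the high 32 bits and a counter in the low 32 bits. The log contents
assume the scenario described above: the quorum holds epoch 1 up to (1,7b) plus
(2,1) to (2,4), while A holds epoch 1 up to (1,a9).

```java
import java.util.ArrayList;
import java.util.List;

public class SyncSketch {
    // A zxid packs the epoch into the high 32 bits and a counter into the low 32 bits.
    static long zxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    // Render a zxid as (epoch,counter) in hex, matching the notation used in this thread.
    static String show(long zxid) {
        return "(" + Long.toHexString(zxid >>> 32) + ","
                + Long.toHexString(zxid & 0xffffffffL) + ")";
    }

    // Hypothetical helper: given the leader's sorted log and the follower's last
    // zxid, list the sync steps the follower should receive in this simplified model.
    static List<String> plan(List<Long> leaderLog, long followerLast) {
        List<String> steps = new ArrayList<>();
        long common = -1;                       // newest leader txn not beyond the follower
        for (long z : leaderLog) {
            if (z <= followerLast) common = z;
        }
        if (common != followerLast) {
            // The follower holds txns the quorum never kept: roll them back first.
            steps.add("TRUNC to " + show(common));
        }
        long first = -1, last = -1;             // range of txns the follower is missing
        for (long z : leaderLog) {
            if (z > common) {
                if (first == -1) first = z;
                last = z;
            }
        }
        if (first != -1) {
            steps.add("DIFF " + show(first) + ".." + show(last));
        }
        return steps;
    }

    // Assumption taken from the comment above: a snapshot newer than the TRUNC
    // target cannot be undone by trimming the txn log alone.
    static boolean truncIsSafe(long newestSnapshot, long truncTarget) {
        return newestSnapshot <= truncTarget;
    }

    public static void main(String[] args) {
        List<Long> leaderLog = new ArrayList<>();
        for (long c = 1; c <= 0x7b; c++) leaderLog.add(zxid(1, c));
        for (long c = 1; c <= 4; c++) leaderLog.add(zxid(2, c));

        // Server A still holds the txns up to (1,a9) that the quorum moved on without.
        for (String step : plan(leaderLog, zxid(1, 0xa9))) {
            System.out.println(step);
        }
        // A snapshot taken at (1,80) is newer than the TRUNC target (1,7b).
        System.out.println("TRUNC safe: " + truncIsSafe(zxid(1, 0x80), zxid(1, 0x7b)));
    }
}
```

Under these assumptions the plan for A is a TRUNC to (1,7b) followed by a DIFF
covering (2,1) to (2,4), and the snapshot check reports that the TRUNC cannot be
applied cleanly once a snapshot newer than (1,7b) exists.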
> Missing ephemeral nodes in one of the members of the ensemble
> -------------------------------------------------------------
>
> Key: ZOOKEEPER-1777
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.4.5
> Environment: Linux, Java 1.7
> Reporter: Germán Blanco
> Assignee: Germán Blanco
> Priority: Blocker
> Fix For: 3.4.6, 3.5.0
>
> Attachments: snaps.tar
>
>
> In a 3-server ensemble, one of the followers doesn't see some of the
> ephemeral nodes that are present in the leader and the other follower.
> The 8 missing nodes in "the follower that is not ok" were created at the end
> of epoch 1; the ensemble is now running in epoch 2.