[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784814#comment-13784814
 ] 

Thawan Kooburat commented on ZOOKEEPER-1777:
--------------------------------------------

Similar to what Flavio already said, here is what I see. 

Between steps 4 and 5, you actually lose a majority of the machines at once, so
the quorum moves forward without the committed txns from (1,7c) to (1,a9).

At step 6, A should get a TRUNC to (1,7b) and then start getting a DIFF with the
txns from (2,1) to (2,4). If A ever produced a snapshot after (1,7b), A won't be
able to process the TRUNC correctly; it will crash and never join a quorum.

If this is not the behavior that you observe, it is a bug in the implementation,
not in the protocol.
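
To make the expected step-6 behavior concrete, here is a minimal sketch of the
sync decision in Python. It is only a simplified model, not the actual ZooKeeper
code: the function name plan_sync, the tuple representation of a zxid as
(epoch, counter), and the assumption that the leader's log is a sorted list of
committed zxids are all made up for illustration. It also assumes A's history
matches the leader's up to the last zxid they share.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# A zxid modeled as (epoch, counter); Python compares tuples lexicographically,
# which matches "epoch first, then counter". This representation is an assumption.
Zxid = Tuple[int, int]


def plan_sync(leader_log: List[Zxid], follower_last: Zxid) -> List[Tuple[str, Optional[object]]]:
    """Hypothetical helper: decide what a leader would send to a follower.

    leader_log is the leader's committed history, sorted ascending.
    follower_last is the last zxid the follower has in its own log.
    """
    # Number of leader entries that are <= the follower's last zxid.
    idx = bisect_right(leader_log, follower_last)
    if idx == 0:
        # Nothing in common within the leader's log: fall back to a snapshot.
        return [("SNAP", None)]

    last_common = leader_log[idx - 1]
    ops: List[Tuple[str, Optional[object]]] = []
    if last_common != follower_last:
        # The follower holds txns the quorum never committed: roll them back.
        ops.append(("TRUNC", last_common))

    missing = leader_log[idx:]
    if missing:
        # Then ship the committed txns the follower has not seen.
        ops.append(("DIFF", missing))
    return ops


# Scenario from this issue: the quorum committed epoch 1 only up to (1,7b),
# then moved on to epoch 2, while A still holds txns up to (1,a9).
leader_log = [(1, c) for c in range(1, 0x7B + 1)] + [(2, c) for c in range(1, 5)]
plan = plan_sync(leader_log, (1, 0xA9))
for op, payload in plan:
    print(op, payload)
```

In this model A is told to TRUNC to (1,7b) and then receives a DIFF with (2,1)
through (2,4). The sketch does not cover the snapshot case mentioned above, where
A has already snapshotted past (1,7b) and cannot apply the TRUNC.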

> Missing ephemeral nodes in one of the members of the ensemble
> -------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1777
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.4.5
>         Environment: Linux, Java 1.7
>            Reporter: Germán Blanco
>            Assignee: Germán Blanco
>            Priority: Blocker
>             Fix For: 3.4.6, 3.5.0
>
>         Attachments: snaps.tar
>
>
> In a 3-server ensemble, one of the followers doesn't see part of the 
> ephemeral nodes that are present in the leader and the other follower. 
> The 8 missing nodes in "the follower that is not ok" were created at the end 
> of epoch 1; the ensemble is running in epoch 2.



--
This message was sent by Atlassian JIRA
(v6.1#6144)