[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129410#comment-14129410
 ] 

Asad Saeed commented on ZOOKEEPER-2033:
---------------------------------------

I am not sure why the last committed zxid is set to the new leader's zxid on 
startup. There has been no modification to the tree and it has not been 
persisted to disk.

This currently affects only peers that sync via SNAP. Peers that sync via the 
DIFF mechanism receive transactions only up to maxCommittedLog, which will 
never include this fake zxid!

If we remove this set, any new Request that comes in would advance the last 
committed zxid to the new epoch once it has been properly persisted.

From DataTree.java:
        /*
         * A snapshot might be in progress while we are modifying the data
         * tree. If we set lastProcessedZxid prior to making corresponding
         * change to the tree, then the zxid associated with the snapshot
         * file will be ahead of its contents. Thus, while restoring from
         * the snapshot, the restore method will not apply the transaction
         * for zxid associated with the snapshot file, since the restore
         * method assumes that transaction to be present in the snapshot.
         *
         * To avoid this, we first apply the transaction and then modify
         * lastProcessedZxid.  During restore, we correctly handle the
         * case where the snapshot contains data ahead of the zxid associated
         * with the file.
         */
        if (rc.zxid > lastProcessedZxid) {
            lastProcessedZxid = rc.zxid;
        }

> zookeeper follower fails to start after a restart immediately following a new 
> epoch
> -----------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2033
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2033
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.4.6
>            Reporter: Asad Saeed
>            Assignee: Asad Saeed
>             Fix For: 3.4.7
>
>         Attachments: ZOOKEEPER-2033.patch
>
>
> The following issue was seen when adding a new node to a zookeeper cluster.
> Reproduction steps
> 1. Create a 2 node ensemble. Write some keys.
> 2. Add another node to the ensemble by modifying the config, then restart 
> the entire cluster.
> 3. Restart the new node before writing any new keys.
> What occurs is that the new node gets a SNAP from the newly elected leader, 
> since it is too far behind. The zxid for this snapshot is from the new epoch, 
> but that zxid is not in the committed log cache.
> On restart of this new node, the follower sends the new epoch zxid. The 
> leader looks at its maxCommittedLog, sees that it is not from the newest 
> epoch, and therefore sends a TRUNC.
> The follower sees the TRUNC, but it only has a snapshot, so it cannot truncate!
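
The TRUNC path described above can be modeled with a small sketch. This is 
only an illustration under stated assumptions, not the actual leader code: 
the class SyncSketch, the SyncMode enum and chooseSyncMode() are hypothetical 
names, and the decision is reduced to a comparison against the committed log 
range. The one piece taken from ZooKeeper itself is the zxid layout (epoch in 
the high 32 bits, counter in the low 32 bits).

```java
// Hypothetical, simplified model of how a leader might pick a sync mode.
// It is NOT the real ZooKeeper implementation; names are illustrative only.
class SyncSketch {
    enum SyncMode { DIFF, TRUNC, SNAP }

    // A zxid packs the epoch into the high 32 bits and a counter into the low 32 bits.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    // Assumed decision rule: a peer ahead of maxCommittedLog is told to truncate,
    // a peer inside the committed log range gets a DIFF, anything older gets a SNAP.
    static SyncMode chooseSyncMode(long peerLastZxid, long minCommittedLog, long maxCommittedLog) {
        if (peerLastZxid > maxCommittedLog) {
            return SyncMode.TRUNC;
        }
        if (peerLastZxid >= minCommittedLog) {
            return SyncMode.DIFF;
        }
        return SyncMode.SNAP;
    }

    public static void main(String[] args) {
        long minCommittedLog = makeZxid(1, 1);
        long maxCommittedLog = makeZxid(1, 5);

        // The follower restarted from a snapshot stamped with the new epoch's
        // zxid, which has no counterpart in the committed log.
        long fakeZxid = makeZxid(2, 0);
        System.out.println(chooseSyncMode(fakeZxid, minCommittedLog, maxCommittedLog)); // TRUNC

        // A follower whose last zxid lies inside the committed log range.
        System.out.println(chooseSyncMode(makeZxid(1, 3), minCommittedLog, maxCommittedLog)); // DIFF

        // A follower older than anything in the committed log.
        System.out.println(chooseSyncMode(makeZxid(0, 7), minCommittedLog, maxCommittedLog)); // SNAP
    }
}
```

In this model the new-epoch zxid always compares greater than maxCommittedLog 
until a real transaction is committed in the new epoch, which is why the 
follower that only holds a snapshot ends up being asked to truncate.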



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
