[
https://issues.apache.org/jira/browse/ZOOKEEPER-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051899#comment-13051899
]
Diogo commented on ZOOKEEPER-876:
---------------------------------
You are both right.
@Camille: true, this line should not be commented out. The idea was to do exactly
what was done before, and to add the proposal of the leader to the committedLog.
@Flavio: yes, it seems the patch is not taking all cases into account.
I am uploading a log and a patch with a test case that shows two problems:
(1) 3 processes out of 5 are started (look for the string RESTART QUORUM): p3, p4
and p5. Process p3 is only a few zxids behind the other processes, so it should
receive a DIFF from leader p5, but instead it receives a SNAP.
(2) After they are synchronized I stop them, and start them again (look for
RESTART QUORUM AGAIN). There was no request processed by the quorum, so p3 and
p4 should receive an empty DIFF from leader p5, but both receive a SNAP.
These problems seem to have two causes: issues in the way the epochs are
compared, and a committedLog (which should contain the committed proposals)
that is sometimes incomplete. I will update the patch later on based on
the current trunk and check the issues Flavio mentioned.
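
To make the expected behaviour in (1) and (2) concrete, here is a minimal
sketch of the decision the leader is supposed to make. It is a simplified
model, not the LearnerHandler code: the class SyncModeSketch, the method
choose and the Plan type are hypothetical names, and the committedLog is
modelled as a sorted list of zxids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SyncModeSketch {
    enum Mode { DIFF, TRUNC, SNAP }

    /** Chosen mode plus the zxids to replay (non-empty only for a real DIFF). */
    static final class Plan {
        final Mode mode;
        final List<Long> toSend;
        Plan(Mode mode, List<Long> toSend) {
            this.mode = mode;
            this.toSend = toSend;
        }
    }

    /**
     * Hypothetical helper: picks a sync mode for a follower.
     * committedLog is assumed to be sorted in ascending zxid order.
     */
    static Plan choose(long peerLastZxid, long leaderLastZxid, List<Long> committedLog) {
        // Case (2): follower already has everything, so an empty DIFF is enough.
        if (peerLastZxid == leaderLastZxid) {
            return new Plan(Mode.DIFF, new ArrayList<>());
        }
        if (!committedLog.isEmpty()) {
            long min = committedLog.get(0);
            long max = committedLog.get(committedLog.size() - 1);
            // Follower is ahead of what the leader committed: it must truncate.
            if (peerLastZxid > max) {
                return new Plan(Mode.TRUNC, new ArrayList<>());
            }
            // Case (1): follower is inside the logged range (bounds inclusive),
            // so replay only the proposals it is missing.
            if (peerLastZxid >= min) {
                List<Long> toSend = new ArrayList<>();
                for (long zxid : committedLog) {
                    if (zxid > peerLastZxid) {
                        toSend.add(zxid);
                    }
                }
                return new Plan(Mode.DIFF, toSend);
            }
        }
        // Empty or too-short committedLog: only a full snapshot can help.
        return new Plan(Mode.SNAP, new ArrayList<>());
    }

    public static void main(String[] args) {
        long e1 = 1L << 32; // epoch 1 in the high 32 bits
        List<Long> log = Arrays.asList(e1 + 1, e1 + 2, e1 + 3, e1 + 4, e1 + 5);
        Plan behind = choose(e1 + 3, e1 + 5, log);
        System.out.println(behind.mode + " " + behind.toSend.size());
        Plan same = choose(e1 + 5, e1 + 5, log);
        System.out.println(same.mode + " " + same.toSend.size());
    }
}
```

In this model an empty committedLog sends a follower that is behind straight
to SNAP, which is consistent with the incomplete committedLog being one of
the causes of the unnecessary snapshots.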
> Unnecessary snapshot transfers between new leader and followers
> ---------------------------------------------------------------
>
> Key: ZOOKEEPER-876
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-876
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.4.0
> Reporter: Diogo
> Assignee: Diogo
> Priority: Minor
> Fix For: 3.4.0
>
> Attachments:
> TEST-org.apache.zookeeper.test.FollowerResyncConcurrencyTest.txt,
> ZOOKEEPER-876.patch
>
>
> When a new leader takes over, unnecessary snapshot transfers happen between
> the new leader and followers. This is caused by several small bugs:
> 1) the comparison of zxids is done based on a new proposal, instead of the
> last logged zxid. (LearnerHandler.java ~ 297)
> 2) if follower is one zxid behind, the check of the interval of committed
> logs excludes the follower. (LearnerHandler.java ~ 277)
> 3) the bug reported in ZOOKEEPER-874 (commitLogs are empty after recovery).
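
As an illustration of bug 1), the sketch below shows why comparing a
follower's last zxid against the zxid of a new proposal makes an up-to-date
follower look stale. It assumes the usual zxid layout (epoch in the high 32
bits, counter in the low 32 bits). The class and method names are
hypothetical and this is not the LearnerHandler code.

```java
public class ZxidCompareSketch {
    /** Builds a zxid from an epoch (high 32 bits) and a counter (low 32 bits). */
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    /** Hypothetical check: is the follower caught up with the reference zxid? */
    static boolean caughtUp(long peerLastZxid, long referenceZxid) {
        return peerLastZxid >= referenceZxid;
    }

    public static void main(String[] args) {
        long lastLogged = makeZxid(1, 5);   // last zxid written in epoch 1
        long newProposal = makeZxid(2, 0);  // a zxid in the new leader's epoch 2
        long follower = makeZxid(1, 5);     // follower has the whole log

        // Against the new proposal, the new epoch alone makes the follower look behind.
        System.out.println(caughtUp(follower, newProposal));
        // Against the last logged zxid, the follower is correctly seen as caught up.
        System.out.println(caughtUp(follower, lastLogged));
    }
}
```

Because the epoch occupies the high bits, any zxid from the new epoch is
larger than every zxid from the previous one, so the reference point for the
comparison has to be the last logged zxid.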
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira