[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053696#comment-13053696
 ] 

Diogo commented on ZOOKEEPER-876:
---------------------------------

Let me first focus on the second case above. The scenario is very simple:

1 - The servers are in sync, and all of them are stopped.
2 - When they are restarted, they elect a new leader and a DIFF is sent.
3 - If they are stopped and restarted again, a snapshot is sent instead.

There is, however, no difference in their state, so there is absolutely no
need to transfer a snapshot. The problem seems to be that after the restart
the followers have the same zxid as the last accepted update (at step 1),
i.e., they somehow forgot the NEWLEADER of step 2.
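
To make the zxid comparison concrete, here is a minimal sketch in Java. It is
not the code in LearnerHandler.java; the class name, method names, and the
decision rule are hypothetical and only illustrate the idea. It relies on the
zxid layout (epoch in the high 32 bits, counter in the low 32 bits) and
assumes, for illustration, that an empty committed log is represented by
min = max = 0. It shows how comparing a follower's last zxid against the zxid
of a new proposal, rather than against the leader's last logged zxid, can make
an up-to-date follower look as if it needs a snapshot.

```java
// Illustrative sketch only; NOT the actual LearnerHandler logic.
final class ZxidSyncSketch {
    enum SyncMode { DIFF, SNAP }

    // A zxid packs the epoch into the high 32 bits and a counter into the low 32 bits.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    // Hypothetical decision rule: send a DIFF when the follower already matches
    // the reference zxid, or when its last zxid lies inside the range of
    // committed proposals the leader still holds; otherwise send a snapshot.
    static SyncMode chooseSync(long followerLastZxid, long referenceZxid,
                               long minCommittedLog, long maxCommittedLog) {
        if (followerLastZxid == referenceZxid) {
            return SyncMode.DIFF; // nothing to transfer
        }
        if (followerLastZxid >= minCommittedLog && followerLastZxid <= maxCommittedLog) {
            return SyncMode.DIFF; // leader can replay the missing proposals
        }
        return SyncMode.SNAP;
    }

    public static void main(String[] args) {
        long lastLogged = makeZxid(1, 5);  // last update accepted before the restart
        long follower = makeZxid(1, 5);    // follower is fully in sync
        long newProposal = makeZxid(2, 0); // first zxid of the new epoch

        // Assumption: min = max = 0 stands for an empty committed log.
        System.out.println(chooseSync(follower, lastLogged, 0L, 0L));
        System.out.println(chooseSync(follower, newProposal, 0L, 0L));
    }
}
```

With the last logged zxid as the reference, the in-sync follower gets a DIFF;
with the new proposal's zxid as the reference and an empty committed log, the
same follower falls through to SNAP even though its state is identical.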

I am uploading a log that shows this. I used grep to keep only the output of
QuorumPeer, which is enough to see the problem. The patch I am sending can
produce such a log (ant test-core-java -Dtestcase="DiffOn*").


> Unnecessary snapshot transfers between new leader and followers
> ---------------------------------------------------------------
>
>                 Key: ZOOKEEPER-876
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-876
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.0
>            Reporter: Diogo
>            Assignee: Diogo
>            Priority: Minor
>             Fix For: 3.4.0
>
>         Attachments: 
> TEST-org.apache.zookeeper.test.FollowerResyncConcurrencyTest.txt, 
> ZOOKEEPER-876-problems.log, ZOOKEEPER-876-problems.patch, ZOOKEEPER-876.patch
>
>
> When a new leadership starts, unnecessary snapshot transfers happen between 
> the new leader and followers. This is caused by several small bugs: 
> 1) The comparison of zxids is based on a new proposal instead of the 
> last logged zxid. (LearnerHandler.java ~ 297)
> 2) If a follower is one zxid behind, the check of the interval of committed 
> logs excludes the follower. (LearnerHandler.java ~ 277)
> 3) The bug reported in ZOOKEEPER-874 (commitLogs are empty after recovery).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
