Use on-disk transaction log for learner sync up
-----------------------------------------------

                 Key: ZOOKEEPER-1413
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1413
             Project: ZooKeeper
          Issue Type: Improvement
          Components: server
    Affects Versions: 3.4.3
            Reporter: Thawan Kooburat
            Priority: Minor


Motivation:
The learner syncs up with the leader by retrieving the committed log from the 
leader. Currently, the leader keeps only the 500 most recently committed log 
entries in memory. If the learner falls more than 500 updates behind, the 
leader sends the entire snapshot to the learner. 

With the snapshot size of some of our ZooKeeper deployments (~10G), it is 
prohibitively expensive to send the entire snapshot over the network. 
Additionally, our ZooKeeper deployments may serve more than 4K updates per 
second, so 500 entries cover only about 125 ms of updates. As a result, a 
network hiccup lasting less than a second will cause the learner to fall back 
to a snapshot transfer.

Design:
Instead of looking only at the committed log in memory, the leader will also 
look at the transaction log on disk. The amount of transaction log kept on 
disk is configurable, and the current default is 100k. This will allow 
ZooKeeper to tolerate longer temporary network failures before initiating a 
snapshot transfer.
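
The decision described above could be sketched roughly as follows. This is 
only an illustration: SyncMode, SyncPlanner and choose are hypothetical names, 
not existing ZooKeeper classes, and the sketch assumes zxids are plain 
contiguous counters and that the learner is never ahead of the leader.

```java
// Hypothetical sketch; none of these names come from the ZooKeeper code base.
enum SyncMode { DIFF_FROM_MEMORY, DIFF_FROM_DISK, SNAPSHOT }

final class SyncPlanner {
    /**
     * Picks how to bring a learner up to date.
     *
     * @param learnerZxid   last transaction id the learner has applied
     * @param minMemoryZxid oldest transaction id in the in-memory committed log
     * @param minDiskZxid   oldest transaction id in the retained on-disk log
     */
    static SyncMode choose(long learnerZxid, long minMemoryZxid, long minDiskZxid) {
        // The learner needs every transaction after learnerZxid. A log whose
        // oldest entry is no later than learnerZxid + 1 covers that range
        // without a gap (assumption: zxids are contiguous counters).
        long firstNeeded = learnerZxid + 1;
        if (firstNeeded >= minMemoryZxid) {
            return SyncMode.DIFF_FROM_MEMORY; // today's fast path
        }
        if (firstNeeded >= minDiskZxid) {
            return SyncMode.DIFF_FROM_DISK;   // new path proposed in this issue
        }
        return SyncMode.SNAPSHOT;             // too far behind for either log
    }
}
```

With this ordering the snapshot transfer becomes the last resort and is used 
only when the learner is behind both the in-memory and the on-disk log.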

Implementation:
We plan to add an interface to the persistence layer that can be used to 
retrieve proposals from the on-disk transaction log. These proposals can then 
be sent to the learner using the existing protocol. 
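
As a rough idea of what such an interface might look like, here is a minimal 
sketch. Proposal, ProposalLog, readAfter and oldestZxid are hypothetical names 
chosen for illustration, not the API that will actually be added, and the 
list-backed class merely stands in for a reader over the real log files.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical types; not taken from the ZooKeeper persistence layer.
final class Proposal {
    final long zxid;      // transaction id
    final byte[] payload; // serialized transaction
    Proposal(long zxid, byte[] payload) {
        this.zxid = zxid;
        this.payload = payload;
    }
}

interface ProposalLog {
    /** Oldest zxid still retained, or Long.MAX_VALUE if the log is empty. */
    long oldestZxid();

    /** Proposals with zxid greater than afterZxid, in log order. */
    Iterator<Proposal> readAfter(long afterZxid);
}

// Stand-in backed by a list sorted by zxid; a real implementation would
// stream entries from the transaction log files instead.
final class ListBackedProposalLog implements ProposalLog {
    private final List<Proposal> entries;

    ListBackedProposalLog(List<Proposal> entries) {
        this.entries = new ArrayList<>(entries);
    }

    @Override
    public long oldestZxid() {
        return entries.isEmpty() ? Long.MAX_VALUE : entries.get(0).zxid;
    }

    @Override
    public Iterator<Proposal> readAfter(long afterZxid) {
        List<Proposal> result = new ArrayList<>();
        for (Proposal p : entries) {
            if (p.zxid > afterZxid) {
                result.add(p);
            }
        }
        return result.iterator();
    }
}
```

The leader would compare the learner's last zxid with oldestZxid() to check 
that the log reaches back far enough, then iterate over readAfter() and send 
each proposal with the existing protocol.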


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
