Song Ziyang created RATIS-1597:
----------------------------------

             Summary: Delay snapshot MD5 computing to InstallSnapshot stream 
process
                 Key: RATIS-1597
                 URL: https://issues.apache.org/jira/browse/RATIS-1597
             Project: Ratis
          Issue Type: Improvement
          Components: performance
            Reporter: Song Ziyang


The leader’s LogAppender while-loop checks the latest snapshot info to decide
whether to send a snapshot to a follower. The SnapshotInfo lists every snapshot
file together with its MD5 digest. Therefore, the StateMachine is required to
compute the MD5 of every file each time it takes a snapshot.
 
However, for database workloads, snapshot files may contain gigabytes of data,
which makes computing MD5 very expensive. Since the MD5 is only used when the
leader sends InstallSnapshot to a follower, it would be better to compute the
MD5 as part of the InstallSnapshot streaming process.
 
Currently, the InstallSnapshot streaming process breaks each snapshot file into
fixed-size chunks and sends them to the follower one by one. Is it possible to
calculate the MD5 while reading each chunk? Such an implementation would avoid
precomputing the MD5 and minimize the IO cost.
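
To make the idea concrete, here is a minimal sketch in plain Java of digesting
a file while it is being chunked. It is not the Ratis API: the class name
ChunkedMd5Sketch, the method sendChunksWithMd5, and the Consumer used as a
stand-in for the chunk sender are all hypothetical, and only the JDK's
MessageDigest is assumed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.function.Consumer;

public class ChunkedMd5Sketch {
  /**
   * Reads the stream in chunks of at most chunkSize bytes, hands each chunk
   * to the (hypothetical) sender, and returns the MD5 of all bytes read.
   * The file is read only once; no separate digest pass is needed.
   */
  public static byte[] sendChunksWithMd5(
      InputStream in, int chunkSize, Consumer<byte[]> sender)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    byte[] buf = new byte[chunkSize];
    int n;
    // readNBytes (Java 9+) fills the buffer unless the stream ends first,
    // and returns 0 once the end of the stream has been reached.
    while ((n = in.readNBytes(buf, 0, chunkSize)) > 0) {
      md5.update(buf, 0, n);                  // fold this chunk into the digest
      sender.accept(Arrays.copyOf(buf, n));   // stand-in for sending the chunk
    }
    return md5.digest();                      // known only after the last chunk
  }

  /** Formats a digest as lowercase hex for comparison or logging. */
  public static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }
}
```

One consequence of this approach is that the digest is only available once the
last chunk has been read, so the receiver could verify it only at the end of
the transfer, for example if the digest were attached to the final chunk. That
placement is an assumption of this sketch, not something the current protocol
is known to do.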



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
