[ https://issues.apache.org/jira/browse/RATIS-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Song Ziyang closed RATIS-1597.
------------------------------
> Delay snapshot MD5 computing to InstallSnapshot stream process
> --------------------------------------------------------------
>
> Key: RATIS-1597
> URL: https://issues.apache.org/jira/browse/RATIS-1597
> Project: Ratis
> Issue Type: Improvement
> Components: performance
> Reporter: Song Ziyang
> Assignee: Song Ziyang
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: 661_review.patch
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> The leader’s LogAppender while-loop checks the latest snapshot info to decide
> whether to send a snapshot to a follower. The SnapshotInfo includes every
> snapshot file with its MD5 digest. Therefore, the StateMachine is required to
> compute the MD5 of every file each time it takes a snapshot.
>
> However, for database workloads, snapshot files may contain GBs of data, which
> makes computing MD5 very time-consuming. Since the MD5 is only used when the
> leader installs a snapshot on a follower (InstallSnapshot), it is better to
> compute it as part of the InstallSnapshot stream process.
>
> Currently, the InstallSnapshot stream process breaks each snapshot file into
> fixed-size chunks and sends them to followers one by one. Is it possible to
> calculate the MD5 while reading each chunk? Such an implementation would avoid
> precomputing the MD5 and minimize the IO cost, since each file is read only once.
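
For illustration only, the idea of digesting while chunking could look roughly like the sketch below. It is not taken from the Ratis code base: ChunkedMd5Sketch, ChunkSender and streamWithDigest are hypothetical names, and the callback merely stands in for whatever actually sends a chunk to a follower. The only real API used is java.security.MessageDigest, which supports incremental update() calls.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class ChunkedMd5Sketch {
  /** Hypothetical stand-in for "send this chunk to the follower". */
  interface ChunkSender {
    void send(byte[] chunk, int length) throws IOException;
  }

  /**
   * Reads the stream in chunks of at most chunkSize bytes, feeds every chunk
   * into a running MD5 digest, and hands it to the sender. The file content is
   * therefore read once, and the digest is available when the last chunk is sent.
   */
  static byte[] streamWithDigest(InputStream in, int chunkSize, ChunkSender sender)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    byte[] buf = new byte[chunkSize];
    int n;
    // read() may return fewer bytes than chunkSize; -1 signals end of stream.
    while ((n = in.read(buf)) != -1) {
      md5.update(buf, 0, n);   // incremental digest over the bytes just read
      sender.send(buf, n);     // assumption: the sender consumes the buffer before returning
    }
    return md5.digest();
  }

  public static void main(String[] args) throws Exception {
    byte[] data = "pretend this is a large snapshot file".getBytes(StandardCharsets.UTF_8);

    byte[] streamed = streamWithDigest(
        new ByteArrayInputStream(data), 8, (chunk, len) -> { /* no-op sender */ });

    // The chunk-by-chunk digest matches a digest computed over the whole content.
    byte[] oneShot = MessageDigest.getInstance("MD5").digest(data);
    System.out.println("digests match: " + Arrays.equals(streamed, oneShot));
  }
}
```

In this sketch the digest is only known after the final chunk has been read, so it would have to travel with (or after) the last chunk rather than in the up-front SnapshotInfo; how that maps onto the actual InstallSnapshot request is left open here.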
--
This message was sent by Atlassian Jira
(v8.20.10#820010)