Song Ziyang created RATIS-1597:
----------------------------------
Summary: Delay snapshot MD5 computing to InstallSnapshot stream
process
Key: RATIS-1597
URL: https://issues.apache.org/jira/browse/RATIS-1597
Project: Ratis
Issue Type: Improvement
Components: performance
Reporter: Song Ziyang
The leader's LogAppender loop checks the latest snapshot info to decide whether to
send a snapshot to a follower. The SnapshotInfo lists every snapshot file
together with its MD5 digest, so the StateMachine is required to compute the MD5
each time it takes a snapshot.
However, for database workloads, snapshot files may contain gigabytes of data,
which makes computing the MD5 a very expensive task. Since the MD5 is only used
when the leader sends InstallSnapshot to a follower, it would be better to
compute it as part of the InstallSnapshot streaming process.
Currently, the InstallSnapshot streaming process breaks each snapshot file into
fixed-size chunks and sends them to followers one by one. Is it possible to
calculate the MD5 while reading each chunk? Such an implementation would avoid
precomputing the MD5 and minimize the I/O cost.
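To make the question concrete, here is a minimal sketch of the idea in plain Java. It is not the Ratis implementation and uses no Ratis APIs: the class ChunkedMd5Sketch, the method sendChunksAndDigest, and the chunkSender callback are hypothetical names standing in for whatever actually ships a chunk to a follower. It only illustrates that the digest can be updated from the same buffer that is being sent, so the file is read once.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical illustration only; none of these names exist in Ratis.
final class ChunkedMd5Sketch {

  /**
   * Reads the file in fixed-size chunks, passes each chunk to chunkSender
   * (a stand-in for the code that sends a chunk to a follower), and feeds
   * the same bytes into an MD5 digest. Returns the MD5 of the whole file.
   */
  static byte[] sendChunksAndDigest(Path file, int chunkSize, Consumer<byte[]> chunkSender)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    byte[] buffer = new byte[chunkSize];
    try (InputStream in = Files.newInputStream(file)) {
      int n;
      // readNBytes (Java 9+) fills the buffer unless EOF is hit; it returns 0 at EOF.
      while ((n = in.readNBytes(buffer, 0, chunkSize)) > 0) {
        md5.update(buffer, 0, n);                      // digest the chunk just read
        chunkSender.accept(Arrays.copyOf(buffer, n));  // hand off a copy of the chunk
      }
    }
    return md5.digest();
  }

  public static void main(String[] args) throws Exception {
    // Build a small fake "snapshot file" so the sketch can run on its own.
    Path file = Files.createTempFile("snapshot", ".bin");
    byte[] data = new byte[10_000];
    Arrays.fill(data, (byte) 7);
    Files.write(file, data);

    AtomicInteger chunkCount = new AtomicInteger();
    byte[] streamed = sendChunksAndDigest(file, 4096, chunk -> chunkCount.incrementAndGet());

    // Compare against a digest computed over the whole content in one call.
    byte[] precomputed = MessageDigest.getInstance("MD5").digest(data);
    System.out.println("chunks sent: " + chunkCount.get());
    System.out.println("digests match: " + Arrays.equals(streamed, precomputed));
    Files.delete(file);
  }
}
```

One consequence of this approach is that the digest of a file is only known after its last chunk has been read, so the receiver could verify it only once the final chunk arrives. Whether the InstallSnapshot protocol can carry the digest that late is an open question for this issue, not something the sketch answers.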
--
This message was sent by Atlassian Jira
(v8.20.7#820007)