[
https://issues.apache.org/jira/browse/RATIS-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tsz-wo Sze resolved RATIS-1862.
-------------------------------
Fix Version/s: 3.0.0
Resolution: Fixed
The pull request is now merged. Thanks, [~tanxinyu]!
> Add the parameter whether to take Snapshot when stopping to adapt to
> different services
> ---------------------------------------------------------------------------------------
>
> Key: RATIS-1862
> URL: https://issues.apache.org/jira/browse/RATIS-1862
> Project: Ratis
> Issue Type: New Feature
> Components: server
> Reporter: Xinyu Tan
> Assignee: Xinyu Tan
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: image-2023-07-28-11-18-28-876.png,
> image-2023-07-28-11-18-52-826.png, image-2023-07-28-12-59-28-050.png,
> image-2023-07-28-13-06-00-209.png
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Recently, during our daily testing, we found that stopping a RaftServer
> could trigger a snapshot that took close to 40s, even though the state in
> the statemachine had not changed. This is not what we expect. If
> we want to take a snapshot for some regions, we do so actively through
> the triggerSnapshot interface. We do not want the RaftServer itself
> to take a snapshot when it stops.
> !image-2023-07-28-12-59-28-050.png!
> After exploring the code, we found that the snapshot was triggered by the
> StateMachineUpdater, which basically triggered a snapshot whenever the
> applyIndex and commitIndex were equal when the cluster was stopped.
> !image-2023-07-28-11-18-28-876.png!
> !image-2023-07-28-11-18-52-826.png!
> !image-2023-07-28-13-06-00-209.png!
> We want to tweak the logic here by adding a triggerSnapshotWhenStopEnabled
> parameter, with a default value of true, and checking it in the
> shouldTakeSnapshot function. This stays fully compatible with existing
> services that take snapshots when the cluster is stopped, while
> IoTDB can set the parameter to false to avoid a snapshot that
> we do not want.
> What's your opinion? [~szetszwo]
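
For illustration, the proposed check can be sketched roughly as follows. This is a minimal, self-contained sketch and not the actual Ratis code: the class name, the method signature, and the `stopping` flag are assumptions made for the example, and only the stop-time decision described in the issue is modelled (other snapshot rules are left out).

```java
// Hypothetical sketch of the stop-time snapshot decision described in the issue.
// None of these names are taken from the Ratis code base.
public class SnapshotOnStopSketch {

    /**
     * Decides whether a snapshot should be taken while the server is stopping.
     *
     * @param stopping                       true if the server is shutting down
     * @param applyIndex                     index of the last applied log entry
     * @param commitIndex                    index of the last committed log entry
     * @param triggerSnapshotWhenStopEnabled the proposed switch (default true)
     */
    static boolean shouldTakeSnapshot(boolean stopping,
                                      long applyIndex,
                                      long commitIndex,
                                      boolean triggerSnapshotWhenStopEnabled) {
        if (!stopping) {
            // Snapshot rules for a running server are outside this sketch.
            return false;
        }
        // Behaviour reported in the issue: snapshot on stop once everything
        // committed has been applied. The new flag can veto that snapshot.
        return triggerSnapshotWhenStopEnabled && applyIndex == commitIndex;
    }

    public static void main(String[] args) {
        // Default (flag = true): keeps the existing stop-time snapshot.
        System.out.println(shouldTakeSnapshot(true, 100L, 100L, true));
        // Flag = false: the stop-time snapshot is skipped.
        System.out.println(shouldTakeSnapshot(true, 100L, 100L, false));
    }
}
```

With the flag left at true, the stop-time behaviour is unchanged, so existing services are not affected; a service that triggers its own snapshots through triggerSnapshot can set it to false.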
--
This message was sent by Atlassian Jira
(v8.20.10#820010)