[ 
https://issues.apache.org/jira/browse/HDDS-11585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated HDDS-11585:
-------------------------------
    Attachment:     (was: image-2024-10-15-11-40-41-618.png)

> Add DN Ratis log purge parameters to detect slow follower
> ---------------------------------------------------------
>
>                 Key: HDDS-11585
>                 URL: https://issues.apache.org/jira/browse/HDDS-11585
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: Ozone Datanode
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>         Attachments: image-2024-10-15-11-56-47-259.png
>
>
> Ozone Ratis pipelines seem to have an indirect mechanism to detect a "slow 
> follower" through the notifyInstallSnapshot mechanism.
> The idea is that if the leader has already purged its logs up to the 
> snapshot index, and the leader's first index is higher than the slow 
> follower's next index (i.e. the log entries to replicate to the slow 
> follower have been purged), the leader will send a notifyInstallSnapshot 
> request to the follower, and the follower will call the 
> StateMachine#notifyInstallSnapshotFromLeader API. The Datanode 
> implementation of notifyInstallSnapshotFromLeader closes the pipeline. This 
> indirectly acts as an automatic "slow follower detector", which might be 
> helpful because writes by default watch for ALL_COMMITTED (i.e. the log 
> index needs to be replicated to all DNs), so a slow follower increases 
> write latency considerably.
> See the following follower index lag, which caused prolonged cluster write 
> degradation and required an administrator to close the pipeline manually.
> !image-2024-10-15-11-56-47-259.png|width=582,height=134!
> Even after the log index difference between leader and follower exceeded 
> 1 million, the pipeline was not automatically closed. 
> The root cause is that raft.server.log.purge.upto.snapshot.index defaults 
> to false. This means that the leader will not purge log entries until they 
> have been replicated to the slow follower. Therefore, the 
> notifyInstallSnapshot mechanism will never be triggered.
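
The purge behaviour described above can be sketched as a small standalone model. This is a simplified illustration, not Ratis code: the class and method names are hypothetical, and the exact arithmetic Ratis applies to the preservation count may differ. It assumes that, with purging up to the snapshot index enabled, the leader keeps preservationLogNum entries behind the snapshot index, and that otherwise it never purges entries the slowest follower still needs.

```java
final class SlowFollowerSketch {

    // Hypothetical model of the highest log index the leader purges
    // after taking a snapshot at snapshotIndex.
    static long purgeUpTo(long snapshotIndex, long slowestFollowerNextIndex,
                          boolean purgeUptoSnapshotIndex, long preservationLogNum) {
        if (purgeUptoSnapshotIndex) {
            // Assumption: purge regardless of follower progress,
            // but keep preservationLogNum entries behind the snapshot.
            return Math.max(-1L, snapshotIndex - preservationLogNum);
        }
        // Assumption: never purge an entry the slowest follower still needs.
        return Math.min(snapshotIndex, slowestFollowerNextIndex - 1);
    }

    // Condition from the issue text: the leader's first index is higher
    // than the follower's next index.
    static boolean shouldNotifyInstallSnapshot(long leaderFirstIndex,
                                               long followerNextIndex) {
        return leaderFirstIndex > followerNextIndex;
    }

    // True if, in this model, the follower would be notified and the
    // datanode would therefore close the pipeline.
    static boolean wouldClosePipeline(long snapshotIndex, long followerNextIndex,
                                      boolean purgeUptoSnapshotIndex,
                                      long preservationLogNum) {
        long leaderFirstIndex = purgeUpTo(snapshotIndex, followerNextIndex,
                purgeUptoSnapshotIndex, preservationLogNum) + 1;
        return shouldNotifyInstallSnapshot(leaderFirstIndex, followerNextIndex);
    }

    public static void main(String[] args) {
        // Follower lags by about 1 million entries; illustrative values only.
        System.out.println("purge flag off: "
                + wouldClosePipeline(2_000_000L, 1_000_000L, false, 0L));
        System.out.println("purge flag on:  "
                + wouldClosePipeline(2_000_000L, 1_000_000L, true, 100_000L));
    }
}
```

In this model, a follower that stays within the preserved window is never notified, which is why the preservation count acts as the slow follower threshold.
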
> Just like HDDS-8131, I propose making 
> raft.server.log.purge.upto.snapshot.index and 
> raft.server.log.purge.preservation.log.num configurable. The 
> recommended configuration would be:
>  * raft.server.log.purge.upto.snapshot.index = true
>  * raft.server.log.purge.preservation.log.num = <SLOW_FOLLOWER_THRESHOLD>
> Other snapshot configurations such as 
> raft.server.snapshot.auto.trigger.threshold (dfs.ratis.snapshot.threshold) 
> also need to be revisited.
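
As a rough illustration of the recommended configuration, the two keys can be written out as plain key/value pairs. The sketch below uses java.util.Properties only to stay dependency-free; in a datanode these values would be applied to the Ratis server properties instead. The class name, the helper method, and the threshold of 1,000,000 are hypothetical. The issue leaves <SLOW_FOLLOWER_THRESHOLD> unspecified.

```java
import java.util.Properties;

final class PurgeConfigSketch {

    // Key names as given in the issue text.
    static final String PURGE_UPTO_SNAPSHOT_INDEX =
            "raft.server.log.purge.upto.snapshot.index";
    static final String PURGE_PRESERVATION_LOG_NUM =
            "raft.server.log.purge.preservation.log.num";

    // Builds the recommended pair of settings for a caller-chosen threshold.
    static Properties recommended(long slowFollowerThreshold) {
        if (slowFollowerThreshold < 0) {
            throw new IllegalArgumentException("threshold must be non-negative");
        }
        Properties props = new Properties();
        props.setProperty(PURGE_UPTO_SNAPSHOT_INDEX, "true");
        props.setProperty(PURGE_PRESERVATION_LOG_NUM,
                Long.toString(slowFollowerThreshold));
        return props;
    }

    public static void main(String[] args) {
        // 1_000_000 is an illustrative value, not a recommendation.
        Properties props = recommended(1_000_000L);
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```
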



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

