[
https://issues.apache.org/jira/browse/HDDS-11585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17923209#comment-17923209
]
Ivan Andika edited comment on HDDS-11585 at 2/3/25 8:22 AM:
------------------------------------------------------------
Closing this for now.
Automatically closing a pipeline that has a slow follower might hide the
underlying write degradation issue and might cause pipelines to be created and
closed frequently, which would cause further issues (QUASI_CLOSED containers
with different BCSIDs, unneeded replications).
It's better to scale the datanodes to increase the write capacity, or to
increase the number of pipelines per DN.
was (Author: JIRAUSER298977):
Closing this for now.
Automatically closing the pipeline for slow follower might hide the underlying
write degradation issue and might cause pipelines to be created and closed
frequently which will cause further issues (QUASI_CLOSED containers with
different BCSID, unneeded replications).
It's better to scale the datanodes to handle the write degradations or increase
the number of pipelines per DN.
> Add DN Ratis log purge parameters to close pipeline with slow follower
> ----------------------------------------------------------------------
>
> Key: HDDS-11585
> URL: https://issues.apache.org/jira/browse/HDDS-11585
> Project: Apache Ozone
> Issue Type: Improvement
> Components: Ozone Datanode
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2024-10-15-11-56-47-259.png
>
>
> Ozone Ratis pipelines seem to have an indirect mechanism to detect a
> follower with a lagging log index (i.e. a "slow follower") through the
> notifyInstallSnapshot mechanism (since installSnapshot is always disabled).
> The idea is that if the leader has already purged its logs up to the
> snapshot index, and the leader's first index is higher than the slow
> follower's next index (i.e. the log entries to replicate to the slow
> follower have been purged), the leader will send a notifyInstallSnapshot
> request to the follower and the follower will call the
> StateMachine#notifyInstallSnapshotFromLeader API.
> The Datanode implementation of notifyInstallSnapshotFromLeader closes the
> pipeline. This indirectly acts as an automatic "slow follower detector",
> which might be helpful since by default writes watch for ALL_COMMITTED
> (i.e. the log index needs to be replicated to all DNs), so a slow follower
> increases write latency considerably.
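As a minimal sketch of the trigger condition described above: the class and method names below are hypothetical and only model the index comparison; they are not the actual Ratis leader code.

```java
/** Simplified, hypothetical model of the leader-side check; not the Ratis implementation. */
final class SlowFollowerCheck {
    /**
     * Returns true when the entry the follower needs next has already been
     * purged from the leader's log, so the leader can no longer catch the
     * follower up by appending entries and would have to notify it instead.
     */
    static boolean needsInstallSnapshotNotification(long leaderFirstLogIndex,
                                                    long followerNextIndex) {
        return followerNextIndex < leaderFirstLogIndex;
    }

    public static void main(String[] args) {
        // Leader has purged everything below index 1,000; follower still needs index 400.
        System.out.println(needsInstallSnapshotNotification(1_000L, 400L));   // true
        // Follower's next entry is still present in the leader's log.
        System.out.println(needsInstallSnapshotNotification(1_000L, 1_200L)); // false
    }
}
```

In this model the check can only fire once the leader has actually purged entries the follower still needs, so purging drives the detection.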
> See the following follower index lag, which caused prolonged cluster write
> degradation and required an administrator to close the pipeline manually.
> !image-2024-10-15-11-56-47-259.png|width=582,height=134!
> Even after the difference in log index between the leader and the follower
> exceeds 1 million, the pipeline is not automatically closed.
> The root cause is that raft.server.log.purge.upto.snapshot.index defaults
> to false. This means that the leader will not purge log entries until they
> have been replicated to the slow follower. Therefore, the
> notifyInstallSnapshot mechanism will never be triggered.
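The effect of the flag can be illustrated with a small model. This is a sketch under the assumption that, with the flag off, the purge index is capped at the lowest index any peer has reached. The class, method, and parameter names are hypothetical, and the numbers are made up for illustration.

```java
import java.util.stream.LongStream;

/** Simplified, hypothetical model of how a purge index is chosen after a snapshot. */
final class PurgeIndexModel {
    /**
     * @param snapshotIndex          index covered by the latest snapshot
     * @param peerIndices            highest index each peer is known to have reached
     * @param purgeUptoSnapshotIndex models raft.server.log.purge.upto.snapshot.index
     * @return the highest index that may be purged (inclusive)
     */
    static long purgeIndex(long snapshotIndex, long[] peerIndices,
                           boolean purgeUptoSnapshotIndex) {
        if (purgeUptoSnapshotIndex) {
            // Purge everything covered by the snapshot, regardless of slow peers.
            return snapshotIndex;
        }
        // Assumed behaviour: never purge past the slowest peer.
        return LongStream.concat(LongStream.of(snapshotIndex), LongStream.of(peerIndices))
                .min()
                .getAsLong();
    }

    public static void main(String[] args) {
        long snapshotIndex = 1_500_000L;
        long[] peers = {1_500_000L, 400_000L}; // second peer is the slow follower
        long slowFollowerNextIndex = 400_001L;

        for (boolean flag : new boolean[] {false, true}) {
            long leaderFirstIndex = purgeIndex(snapshotIndex, peers, flag) + 1;
            boolean notify = slowFollowerNextIndex < leaderFirstIndex;
            System.out.println("flag=" + flag + " leaderFirstIndex=" + leaderFirstIndex
                    + " notify=" + notify);
        }
    }
}
```

With the flag off, the leader's first index never moves past the slow follower's next index in this model, so the notification condition stays false no matter how large the lag grows.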
> Just like HDDS-8131, I propose making
> raft.server.log.purge.upto.snapshot.index and
> raft.server.log.purge.preservation.log.num configurable. The
> recommended configuration would be:
> * raft.server.log.purge.upto.snapshot.index = true
> * raft.server.log.purge.preservation.log.num = <SLOW_FOLLOWER_THRESHOLD>
> Other snapshot-related configurations might need to be revisited:
> * raft.server.snapshot.auto.trigger.threshold
> (hdds.ratis.snapshot.threshold) = default is 100,000
> * raft.server.log.purge.gap (hdds.container.ratis.log.purge.gap) = default
> is 1,000,000
> ** This might be too large; we could reduce it to something like 100,000
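To see how the preservation count and the purge gap might interact, here is a rough model. It assumes that the preservation count caps the purge target at (last log index - preservation) and that a purge is skipped when it would remove fewer entries than the purge gap. Both rules are assumptions made for illustration; the class name and the sample values are hypothetical.

```java
/** Simplified, hypothetical model of the purge decision; not the Ratis implementation. */
final class PurgeGapModel {
    /**
     * Returns the index the log is purged up to after this call.
     *
     * @param suggestedIndex index the caller would like to purge up to
     * @param lastLogIndex   index of the newest entry in the leader's log
     * @param lastPurged     index the log was last purged up to
     * @param preservation   models raft.server.log.purge.preservation.log.num
     * @param purgeGap       models raft.server.log.purge.gap
     */
    static long purge(long suggestedIndex, long lastLogIndex, long lastPurged,
                      long preservation, long purgeGap) {
        long target = suggestedIndex;
        if (preservation > 0) {
            // Always keep the newest `preservation` entries in the log.
            target = Math.min(target, lastLogIndex - preservation);
        }
        // Skip the purge if it would remove fewer than `purgeGap` entries.
        if (target - lastPurged < purgeGap) {
            return lastPurged;
        }
        return target;
    }

    public static void main(String[] args) {
        long snapshotIndex = 1_500_000L;
        long lastLogIndex = 1_600_000L;
        long lastPurged = 900_000L;
        long preservation = 200_000L; // hypothetical SLOW_FOLLOWER_THRESHOLD

        // Purge gap of 1,000,000: the purge is skipped.
        System.out.println(purge(snapshotIndex, lastLogIndex, lastPurged, preservation, 1_000_000L)); // 900000
        // Purge gap of 100,000: the log is purged up to index 1,400,000.
        System.out.println(purge(snapshotIndex, lastLogIndex, lastPurged, preservation, 100_000L));   // 1400000
    }
}
```

Under these assumptions, a large purge gap delays the purge and therefore the point at which a slow follower would be detected, which is one reason the gap might be worth lowering.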
> Alternatively, a more precise and straightforward way is to use
> https://issues.apache.org/jira/browse/RATIS-2156.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)