[ https://issues.apache.org/jira/browse/RATIS-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882101#comment-17882101 ]
Tsz-wo Sze commented on RATIS-2156:
-----------------------------------
Agree. The "StatusRuntimeException: CANCELLED: RST_STREAM closed stream.
HTTP/2 error code: CANCEL" message seems to be caused by RATIS-2135.

> Notify follower slowness based on the log index
> -----------------------------------------------
>
> Key: RATIS-2156
> URL: https://issues.apache.org/jira/browse/RATIS-2156
> Project: Ratis
> Issue Type: Improvement
> Components: Leader
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
> Attachments: image-2024-09-13-18-54-04-203.png
>
>
> Currently, StateMachine.LeaderEventApi#notifyFollowerSlowness is triggered
> based on raft.server.rpc.slowness.timeout. We have seen cases where the RPC
> round-trip time between the leader and a follower does not exceed the
> timeout, yet the difference between the leader's and the follower's log
> index keeps increasing, i.e. the slow follower cannot catch up.
> In Ozone, this causes most watch requests with ALL_COMMITTED replication to
> time out, which increases write latency. It is better to close the
> pipeline if the slow follower cannot catch up.
> !image-2024-09-13-18-54-04-203.png|width=1408,height=244!
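
To make the proposal above concrete, here is a minimal Java sketch of a slowness check driven by the log index gap instead of the RPC timeout. It is an illustration only and not the Ratis implementation: the class LogIndexLagDetector, its methods, and the maxIndexGap threshold are hypothetical names introduced for this example, and no existing Ratis configuration key is implied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper (not part of the Ratis API): flags a follower as slow
 * when its log index trails the leader's by more than a fixed gap,
 * regardless of how quickly its RPCs are answered.
 */
final class LogIndexLagDetector {
  // Hypothetical threshold: largest tolerated (leader index - follower index).
  private final long maxIndexGap;

  LogIndexLagDetector(long maxIndexGap) {
    if (maxIndexGap < 0) {
      throw new IllegalArgumentException("maxIndexGap must be >= 0: " + maxIndexGap);
    }
    this.maxIndexGap = maxIndexGap;
  }

  /** Returns true if the follower is more than maxIndexGap entries behind. */
  boolean isLagging(long leaderLastIndex, long followerMatchIndex) {
    return leaderLastIndex - followerMatchIndex > maxIndexGap;
  }

  /** Returns the ids of all followers whose gap exceeds the threshold. */
  List<String> findLaggingFollowers(long leaderLastIndex,
                                    Map<String, Long> followerMatchIndexes) {
    final List<String> lagging = new ArrayList<>();
    for (Map.Entry<String, Long> e : followerMatchIndexes.entrySet()) {
      if (isLagging(leaderLastIndex, e.getValue())) {
        // A real leader would raise its slowness notification here.
        lagging.add(e.getKey());
      }
    }
    return lagging;
  }
}
```

With a threshold of 1000, a leader at index 5000 would treat a follower at index 3000 as slow even if that follower answers every RPC within the timeout, while a follower at index 4500 would not be flagged. A fixed gap is the simplest possible criterion; checking whether the gap keeps growing over time, as the description observes, would need extra state per follower.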