[ 
https://issues.apache.org/jira/browse/RATIS-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated RATIS-2392:
-------------------------------
    Attachment: image-2026-02-04-17-01-22-314.png

> Leader should trigger heartbeat immediately after ReadIndex
> -----------------------------------------------------------
>
>                 Key: RATIS-2392
>                 URL: https://issues.apache.org/jira/browse/RATIS-2392
>             Project: Ratis
>          Issue Type: Improvement
>          Components: Linearizable Read, performance
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>         Attachments: image-2026-02-04-17-01-22-314.png, 
> image-2026-02-04-17-01-50-676.png
>
>
> This issue was found while debugging the slow 
> {{TestOzoneShellHAWithFollowerRead}} (it ran for as long as 10 minutes, 
> whereas {{TestOzoneShellHA}} runs for only 2 minutes). The 
> {{OzoneManagerProtocolServerSideTranslatorPB#submitReadRequestToOM}} latency 
> was observed to be around 500 ms for some read requests, which is 
> unacceptably long and exceeds disk latency. Since the test runs locally, 
> high ReadIndex network latency can be ruled out.
> After a long investigation and debugging, the main latency was found to be in 
> the follower's {{ReadRequests#waitForAdvance}}. However, the main follower 
> bottleneck is in {{StateMachineUpdater#waitForCommit}} rather than in the 
> previously hypothesized causes: 1) slow follower 
> {{StateMachine#applyTransactions}}, 2) the {{ReadIndex}} network 
> communication, or 3) the leader's {{ReadIndex}} latency (which should already 
> be solved by RATIS-2379 and RATIS-2382).
> From the debug logs, the root cause is that the follower has not yet seen the 
> leader's latest commitIndex (e.g. the leader's commitIndex is 10, but the 
> follower's commitIndex is 9), so the follower cannot advance its commitIndex 
> and apply transactions up to the higher commitIndex (see 
> {{StateMachineUpdater#waitForCommit}}). The follower is therefore stuck 
> waiting in {{StateMachineUpdater#waitForCommit}} until it receives an 
> AppendEntries from the leader with leaderCommit >= readIndex, because the 
> leader's commitIndex is only included in {{AppendEntries}}.
> One solution is to trigger a heartbeat / AppendEntries to the follower 
> immediately after the ReadIndex is returned. Previously I also considered 
> allowing an {{AppendEntriesRequestProto}} to be added to the 
> {{ReadIndexReplyProto}} to reduce the number of RPC calls, but this can cause 
> subtle bugs and a further latency increase (the follower needs to process and 
> reply to the AppendEntries; otherwise the leader will need to keep sending 
> it).
> After the improvement, the test time goes down from 10 minutes to 2 minutes 
> (similar to {{TestOzoneShellHA}}). However, I suspect the performance 
> improvement is largest when the Ratis group is not busy (i.e. there are few 
> AppendEntries), since otherwise one of those AppendEntries would carry the 
> leaderCommit anyway.
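
The wait described in the issue can be illustrated with a minimal sketch. This is a toy model, not Ratis code: the class names ReadIndexSketch, Follower and Leader and their methods are hypothetical, and applying transactions is collapsed into advancing commitIndex. It only shows that a follower read for readIndex 10 stays pending while the follower's commitIndex is 9, and completes once an AppendEntries-style message carries the leader's commitIndex.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical model for illustration only; none of these types are Ratis APIs.
class ReadIndexSketch {

  static final class Follower {
    private long commitIndex;
    // Reads waiting for the follower to reach a given readIndex.
    private final ConcurrentSkipListMap<Long, CompletableFuture<Long>> pending =
        new ConcurrentSkipListMap<>();

    Follower(long commitIndex) {
      this.commitIndex = commitIndex;
    }

    // Completes immediately if the follower has already caught up to readIndex,
    // otherwise returns a future that stays pending until commitIndex advances.
    synchronized CompletableFuture<Long> waitForAdvance(long readIndex) {
      if (commitIndex >= readIndex) {
        return CompletableFuture.completedFuture(commitIndex);
      }
      return pending.computeIfAbsent(readIndex, k -> new CompletableFuture<>());
    }

    // In this model, AppendEntries is the only message carrying leaderCommit.
    synchronized void onAppendEntries(long leaderCommit) {
      commitIndex = Math.max(commitIndex, leaderCommit);
      ConcurrentNavigableMap<Long, CompletableFuture<Long>> ready =
          pending.headMap(commitIndex, true);
      ready.values().forEach(f -> f.complete(commitIndex));
      ready.clear();
    }
  }

  static final class Leader {
    private final long commitIndex;

    Leader(long commitIndex) {
      this.commitIndex = commitIndex;
    }

    // The ReadIndex reply tells the follower which index it must reach.
    long readIndex() {
      return commitIndex;
    }

    // Heartbeat (empty AppendEntries) that carries the leader's commitIndex.
    void heartbeat(Follower follower) {
      follower.onAppendEntries(commitIndex);
    }
  }

  public static void main(String[] args) {
    Leader leader = new Leader(10);
    Follower follower = new Follower(9);

    long readIndex = leader.readIndex();
    CompletableFuture<Long> read = follower.waitForAdvance(readIndex);
    System.out.println("done before heartbeat: " + read.isDone());

    // Proposed behavior: heartbeat right after the ReadIndex reply.
    leader.heartbeat(follower);
    System.out.println("done after heartbeat: " + read.isDone());
  }
}
```

In this model the read stays pending for as long as the heartbeat is delayed, which is why sending it right after the ReadIndex reply removes the wait when no other AppendEntries is in flight.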



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
