[
https://issues.apache.org/jira/browse/RATIS-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tsz-wo Sze resolved RATIS-2392.
-------------------------------
Fix Version/s: 3.3.0
Resolution: Fixed
The pull request is now merged. Thanks, [~ivanandika]!
> Leader should trigger heartbeat immediately after ReadIndex
> -----------------------------------------------------------
>
> Key: RATIS-2392
> URL: https://issues.apache.org/jira/browse/RATIS-2392
> Project: Ratis
> Issue Type: Improvement
> Components: Linearizable Read, performance
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
> Fix For: 3.3.0
>
> Attachments: image-2026-02-04-17-01-22-314.png,
> image-2026-02-04-17-01-50-676.png, image-2026-02-04-17-02-15-168.png
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> This issue was found while debugging the slow
> {{TestOzoneShellHAWithFollowerRead}} (it ran for as long as 10 minutes,
> whereas {{TestOzoneShellHA}} runs for only 2 minutes). The
> {{OzoneManagerProtocolServerSideTranslatorPB#submitReadRequestToOM}} latency
> was observed to be around 500 ms for some read requests, which is
> unacceptably long and exceeds disk latency. Since the test runs locally,
> high ReadIndex network latency can be ruled out.
> After a long investigation, the main latency was traced to the follower's
> {{ReadRequests#waitForAdvance}}. The actual follower bottleneck, however, is
> in {{StateMachineUpdater#waitForCommit}}, not in any of the previous
> hypotheses: 1) slow follower {{StateMachine#applyTransactions}}, 2) the
> {{ReadIndex}} network communication, or 3) the leader's {{ReadIndex}} latency
> (which should already be solved by RATIS-2379 and RATIS-2382).
> From the debug logs, the root cause is that the follower has not yet seen the
> leader's latest commitIndex (e.g. the leader's commitIndex is 10, but the
> follower's commitIndex is 9), so the follower cannot advance its commitIndex
> and apply transactions up to the higher commitIndex (see
> {{StateMachineUpdater#waitForCommit}}). The follower is therefore stuck
> waiting in {{StateMachineUpdater#waitForCommit}} until it receives an
> AppendEntries from the leader with leaderCommit >= readIndex, because the
> leader's commitIndex is only included in {{AppendEntries}}.
> One solution is to trigger a heartbeat / AppendEntries to the follower
> immediately after the ReadIndex is returned. I had also considered allowing an
> {{AppendEntriesRequestProto}} to be added to the {{ReadIndexReplyProto}} to
> reduce the number of RPC calls, but this can cause subtle bugs and a further
> latency increase (the follower needs to process and reply to the
> AppendEntries; otherwise the leader will need to keep sending it).
> After the improvement, the test run time drops from 10 minutes to 2 minutes
> (similar to {{TestOzoneShellHA}}). However, when I benchmarked the
> performance (mixed read and write workloads), there were no significant
> improvements (<10% read throughput increase in some workloads). I suspect the
> performance improvement is largest when the Ratis group is not busy (i.e.
> there are not many AppendEntries), since otherwise one of the regular
> AppendEntries will carry the leaderCommit anyway.
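
The wait described in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions and is not Ratis code: the class ReadIndexHeartbeatSketch, its Follower type, and the waitMillis helper are hypothetical, and the 500 ms heartbeat interval and the 20 ms reply time are made-up values chosen to echo the numbers in the description. The model assumes an idle group, where the periodic heartbeat is the only AppendEntries that carries leaderCommit.

```java
/**
 * Toy model only; NOT Ratis code. All names and numbers here are hypothetical.
 * It assumes an idle group: the periodic heartbeat is the only AppendEntries.
 */
final class ReadIndexHeartbeatSketch {

    /** Hypothetical follower that tracks the highest commitIndex it has learned. */
    static final class Follower {
        private long commitIndex;

        Follower(long commitIndex) {
            this.commitIndex = commitIndex;
        }

        /** An AppendEntries (or an empty heartbeat) carries the leader's commitIndex. */
        void onAppendEntries(long leaderCommit) {
            commitIndex = Math.max(commitIndex, leaderCommit);
        }

        /** The read can be served once the follower's commitIndex reaches readIndex. */
        boolean canServeRead(long readIndex) {
            return commitIndex >= readIndex;
        }
    }

    /**
     * Returns how long (in simulated ms) the follower waits after the ReadIndex
     * reply arrives at replyAtMs. Periodic heartbeats are assumed to be sent at
     * multiples of heartbeatIntervalMs.
     */
    static long waitMillis(Follower follower, long leaderCommit, long replyAtMs,
                           long heartbeatIntervalMs, boolean heartbeatAfterReadIndex) {
        final long readIndex = leaderCommit; // the leader answers with its commitIndex
        long now = replyAtMs;
        if (heartbeatAfterReadIndex) {
            // The proposed behavior: a heartbeat right after the ReadIndex reply.
            follower.onAppendEntries(leaderCommit);
        }
        while (!follower.canServeRead(readIndex)) {
            // Otherwise the follower waits for the next periodic heartbeat.
            now = (now / heartbeatIntervalMs + 1) * heartbeatIntervalMs;
            follower.onAppendEntries(leaderCommit);
        }
        return now - replyAtMs;
    }

    public static void main(String[] args) {
        // Leader commitIndex 10, follower commitIndex 9, as in the description.
        long periodic = waitMillis(new Follower(9), 10, 20, 500, false);
        long immediate = waitMillis(new Follower(9), 10, 20, 500, true);
        System.out.println("periodic heartbeat only: " + periodic + " ms");
        System.out.println("heartbeat after ReadIndex: " + immediate + " ms");
    }
}
```

In this model the follower that is one index behind waits for most of a heartbeat interval unless the leader sends a heartbeat right after the ReadIndex reply, while a follower that is already caught up does not wait in either case. A busy group would add further onAppendEntries calls between heartbeats, which is consistent with the small benchmark gain reported above.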