[
https://issues.apache.org/jira/browse/HDDS-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17260277#comment-17260277
]
runzhiwang commented on HDDS-1175:
----------------------------------
bq. So you are suggesting that if read requests are also submitted to Ratis,
then in the future, if a follower OM starts serving read requests, this will be
simpler, and the code for reads will be invoked from the Ratis layer.
[~bharat] Yes, it is Ratis's responsibility to support followers serving read
requests:
1. If reading from the leader:
1.a. The leader checks its "Leader Lease".
1.b. If the "Leader Lease" is valid, the leader waits until applyIndex >=
commitIndex, then serves the read request.
1.c. If the "Leader Lease" is not valid, a split brain may have happened, so
the leader needs to send a heartbeat to the followers. If a majority responds,
no split brain has happened; the leader waits until applyIndex >= commitIndex,
then serves the read request.
2. If reading from a follower:
2.a. The follower sends a request to the leader asking for the leader's
commitIndex.
2.b. The leader checks its "Leader Lease".
2.c. If the "Leader Lease" is valid, the leader replies to the follower with
its commitIndex.
2.d. If the "Leader Lease" is not valid, the leader needs to send a heartbeat
to the followers. If a majority responds, no split brain has happened, and the
leader replies to the follower with its commitIndex.
2.e. The follower waits until (follower's applyIndex) >= (leader's
commitIndex), then processes the read request.
So the work above should be done in Ratis. If it is, Ozone does not need to
care whether a read goes to the leader or to a follower, or whether a split
brain has happened.
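The steps above can be sketched roughly as follows. This is only an
illustration of the proposed flow, not the Ratis API: the class
ReadIndexSketch, the Peer type, and all method and field names are
hypothetical, the heartbeat round is reduced to a boolean callback, and the
"wait until applyIndex >= commitIndex" step is reduced to a check of whether
the read could be served right now.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of lease-checked reads; none of these names come from Ratis. */
public class ReadIndexSketch {

    /** Minimal stand-in for the state of one Raft peer. */
    public static class Peer {
        public long commitIndex;
        public long applyIndex;
        /** Leader only: whether the "Leader Lease" is still valid. */
        public boolean leaseValid;
        /** Leader only: returns true if a majority answers a heartbeat (assumed callback). */
        public BooleanSupplier majorityHeartbeat;

        public Peer(long commitIndex, long applyIndex,
                    boolean leaseValid, BooleanSupplier majorityHeartbeat) {
            this.commitIndex = commitIndex;
            this.applyIndex = applyIndex;
            this.leaseValid = leaseValid;
            this.majorityHeartbeat = majorityHeartbeat;
        }
    }

    /**
     * Steps 1.a-1.c and 2.b-2.d: the leader confirms it is still the leader,
     * either through a valid lease or through a majority heartbeat, and then
     * hands out its commitIndex as the index the read must observe.
     */
    public static long confirmedCommitIndex(Peer leader) {
        if (!leader.leaseValid && !leader.majorityHeartbeat.getAsBoolean()) {
            throw new IllegalStateException("leadership not confirmed; possible split brain");
        }
        return leader.commitIndex;
    }

    /** A peer may serve the read once it has applied everything up to readIndex. */
    public static boolean canServe(Peer peer, long readIndex) {
        return peer.applyIndex >= readIndex;
    }

    /** Case 1: read from the leader. */
    public static boolean leaderCanServeRead(Peer leader) {
        return canServe(leader, confirmedCommitIndex(leader));
    }

    /** Case 2: read from a follower, which first asks the leader for its commitIndex. */
    public static boolean followerCanServeRead(Peer follower, Peer leader) {
        return canServe(follower, confirmedCommitIndex(leader));
    }

    public static void main(String[] args) {
        Peer leader = new Peer(10, 10, true, () -> true);
        Peer follower = new Peer(8, 8, false, () -> false);

        System.out.println(leaderCanServeRead(leader));             // true
        System.out.println(followerCanServeRead(follower, leader)); // false: follower is behind
        follower.applyIndex = 10;                                   // follower catches up
        System.out.println(followerCanServeRead(follower, leader)); // true
    }
}
```

In this sketch the only difference between the two cases is whose applyIndex
is compared against the leader's commitIndex, which is why the logic could
live in Ratis without Ozone knowing which peer answered.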
> Serve read requests directly from RocksDB
> -----------------------------------------
>
> Key: HDDS-1175
> URL: https://issues.apache.org/jira/browse/HDDS-1175
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Components: OM HA, Ozone Manager
> Reporter: Hanisha Koneru
> Assignee: Hanisha Koneru
> Priority: Major
> Labels: pull-request-available
> Attachments: HDDS-1175.001.patch
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> We can directly serve read requests from the OM's RocksDB instead of going
> through the Ratis server. OM should first check its role and serve read
> requests only if it is the leader.
> There can be a scenario where an OM loses its Leader status but does not
> know about the new election in the ring. This OM could serve stale reads for
> the duration of the heartbeat timeout, but this should be acceptable (similar
> to how a Standby Namenode could possibly serve stale reads until it figures
> out the new status).