[
https://issues.apache.org/jira/browse/RATIS-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Andika updated RATIS-2382:
-------------------------------
Description:
The purpose of the leadership check on every ReadIndex is for the leader to
confirm that it is still the latest leader before replying to the ReadIndex
request. Otherwise, the leader might serve stale data during a split brain,
i.e. when there are two concurrent leaders (which might be quite rare). This
leadership check seems to increase latency, since it causes the readIndex call
to block.
One way to reduce round trips without reducing consistency is to use a leader
lease, but it looks like that might not be enough, since heartbeats might
still be sent when the leader lease has expired. A leader lease might also
lengthen election time, which might affect availability.
We can let the leader's readIndex simply return without waiting for the
AppendEntries (AE) replies (similar to the behavior of a leader holding a
lease). Note that letting the leader skip this leadership check permits the
split-brain scenario, so reads are no longer linearizable because stale reads
become possible; we need to add this warning explicitly. With this in mind,
use cases / workloads that care more about performance and can tolerate being
"mostly" consistent (i.e. consistent under steady state, with no elections,
network partitions, etc.) can enable this to improve ReadIndex latency without
the leader lease issues.
We can make this configurable and disabled by default.
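
To make the trade-off concrete, here is a minimal sketch of the two code paths,
not the Ratis implementation. The class ReadIndexSketch, the
skipLeadershipCheck flag, and the heartbeatRound supplier are hypothetical
names made up for illustration; the supplier stands in for one round of
AppendEntries heartbeats that completes with true once a majority has
acknowledged the leader.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical model of a leader answering ReadIndex; not the Ratis API. */
final class ReadIndexSketch {
    // Proposed option (assumed name); false keeps the linearizable behavior.
    private final boolean skipLeadershipCheck;
    // Stand-in for one heartbeat round: completes with true on majority ack.
    private final Supplier<CompletableFuture<Boolean>> heartbeatRound;
    private final long commitIndex;

    ReadIndexSketch(boolean skipLeadershipCheck,
                    Supplier<CompletableFuture<Boolean>> heartbeatRound,
                    long commitIndex) {
        this.skipLeadershipCheck = skipLeadershipCheck;
        this.heartbeatRound = heartbeatRound;
        this.commitIndex = commitIndex;
    }

    CompletableFuture<Long> readIndex() {
        // The read index is the commit index observed when the request arrives.
        final long index = commitIndex;
        if (skipLeadershipCheck) {
            // Fast path: no round trip. WARNING: a deposed leader would still
            // answer here, so reads may be stale and are not linearizable.
            return CompletableFuture.completedFuture(index);
        }
        // Safe path: block on a heartbeat round to confirm leadership first.
        return heartbeatRound.get().thenApply(acked -> {
            if (!acked) {
                throw new IllegalStateException("leadership not confirmed");
            }
            return index;
        });
    }
}
```

With the flag off, the returned future stays incomplete until the heartbeat
round finishes, which is the blocking behavior described above. With the flag
on, it is already complete when readIndex returns.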
was:
The purpose of the leadership check on every ReadIndex is for the leader to
confirm that it is still the latest leader before replying to the ReadIndex
request. Otherwise, the leader might serve stale data during a split brain,
i.e. when there are two concurrent leaders (which might be quite rare). This
leadership check seems to increase latency, since it causes the readIndex call
to block.
One way to reduce round trips without reducing consistency is to use a leader
lease, but it looks like that might not be enough, since heartbeats might
still be sent when the leader lease has expired. A leader lease might also
lengthen election time, which might affect availability.
We can let the leader's readIndex simply return without waiting for the
AppendEntries (AE) replies (similar to the behavior of a leader holding a
lease). Note that letting the leader skip this leadership check permits the
split-brain scenario, so reads are no longer linearizable because stale reads
become possible. With this in mind, use cases / workloads that care more about
performance and can tolerate being "mostly" consistent (i.e. consistent under
steady state, with no elections, network partitions, etc.) can enable this to
improve ReadIndex latency without the leader lease issues.
We can make this configurable and disabled by default.
> Support skip leadership check during ReadIndex
> ----------------------------------------------
>
> Key: RATIS-2382
> URL: https://issues.apache.org/jira/browse/RATIS-2382
> Project: Ratis
> Issue Type: Improvement
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major