[
https://issues.apache.org/jira/browse/KUDU-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
shenxingwuying updated KUDU-3383:
---------------------------------
Description:
As described in https://issues.apache.org/jira/browse/KUDU-3382.
h1. Background && Motivation
Linearizable reads are very friendly for developers, and Kudu can support them.
h1. Issue with linearizable reads from the leader
We need to discuss this issue.
Kudu's Raft implementation uses a strong leader: the leader's state machine is
never behind the followers', and a follower whose heartbeat times out, or that
receives a leader election request (leader transfer), can start an election and
cause the leader to change.
If Kudu needs linearizable reads, reading from the leader is not enough, because
two leaders may exist for a short period of time.
Here is a scenario.
!image-2022-07-20-23-17-40-718.png!
# A Raft group has 3 replicas: L1, F2, and F3. Their state is stable during
term 1.
# A network partition occurs: F2 and F3 lose the leader's heartbeats, F3 starts
an election, and F2 votes for it.
# F3 becomes leader; call it L3. At this moment there are 2 leaders:
L1(1) and L3(2), where the number in parentheses is the term.
# This state lasts until the network partition heals, which may take a short or
a long time.
While two leaders exist, reads are not linearizable. Kudu should therefore
avoid having two leaders at any time; the cost is a short period with no leader
at all. Kudu has to make that choice. Users usually need linearizability, so I
think Kudu should support it. The brief unavailability while there is no leader
can be absorbed by the client's fault tolerance.
Whether reading from the leader is linearizable today still needs to be
confirmed: someone who knows can confirm it, or I can run an experiment.
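The scenario above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: the Replica class, the replica names R1..R3, and the key "k" are hypothetical and are not Kudu code. It shows how a read served locally by the old leader can miss a write committed by the new leader.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one replica of a Raft group (not Kudu code)."""
    name: str
    term: int = 1
    is_leader: bool = False
    data: dict = field(default_factory=dict)


def simulate_two_leaders():
    # Step 1: steady state in term 1; R1 leads and all replicas agree.
    r1 = Replica("R1", is_leader=True, data={"k": "old"})
    r2 = Replica("R2", data={"k": "old"})
    r3 = Replica("R3", data={"k": "old"})

    # Steps 2-3: the partition isolates R1. R3 times out, starts an election
    # for term 2, and R2 votes for it. R1 never hears about term 2, so it
    # still believes it is the leader.
    r2.term = r3.term = 2
    r3.is_leader = True
    leaders = [(r.name, r.term) for r in (r1, r2, r3) if r.is_leader]

    # The new leader commits a write on the majority {R2, R3}.
    for replica in (r2, r3):
        replica.data["k"] = "new"

    # Step 4: until the partition heals, a client that still talks to R1
    # reads the old value even though the write has already been committed.
    stale_read = r1.data["k"]
    fresh_read = r3.data["k"]
    return leaders, stale_read, fresh_read
```

Calling simulate_two_leaders() reports both R1 (term 1) and R3 (term 2) as leaders, and the two reads disagree, which is the linearizability violation described above.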
h1. Solution
Kudu should avoid having two leaders, even for a short period while a network
fault is happening. I reviewed the code and think the problem exists today.
To avoid the two-leader problem, the leader should track its own liveness. If a
leader does not receive enough heartbeat responses within a period of time, it
should step down and then start another election, just as a follower does. The
leader's timeout should be shorter than the followers' election timeout.
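A minimal sketch of the step-down idea follows. The LeaderLiveness class and its parameter names are hypothetical and do not correspond to Kudu's consensus code; the sketch also assumes that timestamps come from a monotonic clock and that clock drift between replicas is small compared with the gap between the two timeouts.

```python
class LeaderLiveness:
    """Hypothetical helper: a leader steps down when it loses its majority."""

    def __init__(self, peers, step_down_timeout, election_timeout, now=0.0):
        # The leader must give up before any follower can start an election.
        if step_down_timeout >= election_timeout:
            raise ValueError("step_down_timeout must be < election_timeout")
        self.step_down_timeout = step_down_timeout
        self.last_ack = {peer: now for peer in peers}
        self.is_leader = True

    def record_ack(self, peer, now):
        """Remember when a follower last acknowledged a heartbeat."""
        self.last_ack[peer] = now

    def has_majority(self, now):
        """Count followers heard from recently, plus the leader itself."""
        fresh = sum(
            1 for t in self.last_ack.values()
            if now - t <= self.step_down_timeout
        )
        cluster_size = len(self.last_ack) + 1
        return fresh + 1 >= cluster_size // 2 + 1

    def tick(self, now):
        """Called periodically; steps down once the majority is lost."""
        if self.is_leader and not self.has_majority(now):
            self.is_leader = False
        return self.is_leader
```

With followers F2 and F3, a step-down timeout of 1.0 and an election timeout of 1.5, the leader stays in place as long as one follower acknowledged within the last 1.0 time units, and steps down otherwise.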
Another scheme: before serving a read, the leader sends a heartbeat to the
followers to make sure it is still the valid leader.
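The read-time check could look roughly like the sketch below. Everything here is an assumption for illustration: confirmed_read, NotLeaderError, and the send_heartbeat callable (which returns True when a follower still accepts the leader's term) are hypothetical names, not Kudu APIs. A complete version would also wait until the local state machine has applied everything committed before the read started.

```python
class NotLeaderError(Exception):
    """Raised when leadership cannot be confirmed by a majority."""


def confirmed_read(key, store, term, peers, send_heartbeat):
    """Serve a read only after a majority confirms the current term.

    send_heartbeat(peer, term) is a hypothetical callable that returns True
    if the peer still accepts this replica as leader for the given term.
    """
    acks = 1  # the leader counts itself
    for peer in peers:
        if send_heartbeat(peer, term):
            acks += 1

    cluster_size = len(peers) + 1
    if acks < cluster_size // 2 + 1:
        raise NotLeaderError("could not confirm leadership for term %d" % term)
    return store.get(key)
```

In the three-replica scenario, an isolated L1 gets no acknowledgements, so the read fails with NotLeaderError instead of returning stale data, and the client can retry against the new leader. The cost is one extra round trip per read.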
was:
As described in https://issues.apache.org/jira/browse/KUDU-3382.
h1. Background && Motivation
Linearizable reads are very friendly for developers, and Kudu can support them.
h1. Issue with linearizable reads from the leader
We need to discuss this issue.
Kudu's Raft implementation uses a strong leader: the leader's state machine is
never behind the followers', and a follower whose heartbeat times out, or that
receives a leader election request (leader transfer), can start an election and
cause the leader to change.
If Kudu needs linearizable reads, reading from the leader is not enough, because
two leaders may exist for a short period of time.
Here are two scenarios. The first one:
{{L1(1)----F2(1) L(1) F2(2)| /
| ---> /F3(1) L3(2)}}
# A Raft group has 3 replicas: L1, F2, and F3. Their state is stable during
term 1.
# A network partition occurs: F2 and F3 lose the leader's heartbeats, F3 starts
an election, and F2 votes for it.
# F3 becomes leader; call it L3. At this moment there are 2 leaders:
L1(1) and L3(2), where the number in parentheses is the term.
# This state lasts until the network partition heals, which may take a short or
a long time.
While two leaders exist, reads are not linearizable. Kudu should therefore
avoid having two leaders at any time; the cost is a short period with no leader
at all. Kudu has to make that choice. Users usually need linearizability, so I
think Kudu should support it. The brief unavailability while there is no leader
can be absorbed by the client's fault tolerance.
Whether reading from the leader is linearizable today still needs to be
confirmed: someone who knows can confirm it, or I can run an experiment.
{{L(1)----F(1) L(1) F(2)| /
| ---> /F(1) L(2)}}
h1. Solution
Kudu should avoid having two leaders, even for a short period while a network
fault is happening. I reviewed the code and think the problem exists today.
To avoid the two-leader problem, the leader should track its own liveness. If a
leader does not receive enough heartbeat responses within a period of time, it
should step down and then start another election, just as a follower does. The
leader's timeout should be shorter than the followers' election timeout.
Another scheme: before serving a read, the leader sends a heartbeat to the
followers to make sure it is still the valid leader.
> About strong consistency read from leader
> -----------------------------------------
>
> Key: KUDU-3383
> URL: https://issues.apache.org/jira/browse/KUDU-3383
> Project: Kudu
> Issue Type: Improvement
> Reporter: shenxingwuying
> Priority: Major
> Attachments: image-2022-07-20-23-14-34-519.png,
> image-2022-07-20-23-17-40-718.png
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)