[
https://issues.apache.org/jira/browse/IGNITE-23780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mirza Aliev updated IGNITE-23780:
---------------------------------
Description:
h3. Motivation
According to
[IEP-131|https://cwiki.apache.org/confluence/display/IGNITE/IEP-131%3A+Partition+Majority+Unavailability+Handling]
when a node starts or restarts, for HA partitions we must decide whether to
start the raft group on this node, based on the stable and pending
assignments received from the recovered metastorage.
Below are the combinations of stable and pending (forced or not) assignments,
whether the node is present in stable, pending, or both, and the action on
restart (start the raft group or not).
Full table with all combinations and repetitions:
|| # || stable || pending || in stable || in pending || in both || action ||
| 1 | empty | empty | no | no | no | nothing |
| 2 | empty | exists | no | no | no | nothing |
| 3 | empty | forced | no | no | no | nothing |
| 4 | exists | empty | yes | no | no | nothing |
| 5 | exists | exists | yes | no | no | nothing |
| 6 | exists | forced | yes | no | no | nothing |
| 7 | empty | empty | no | no | no | nothing |
| 8 | empty | exists | no | yes | no | start |
| 9 | empty | forced | no | yes | no | start |
| 10 | exists | empty | no | no | no | stop |
| 11 | exists | exists | no | yes | no | start |
| 12 | exists | forced | no | yes | no | start |
| 13 | empty | empty | no | no | no | nothing |
| 14 | empty | exists | no | no | no | nothing |
| 15 | empty | forced | no | no | no | nothing |
| 16 | exists | empty | no | no | no | nothing |
| 17 | exists | exists | yes | yes | yes | nothing |
| 18 | exists | forced | yes | yes | yes | nothing |
| 19 | empty | empty | no | no | no | nothing |
| 20 | empty | exists | no | no | no | nothing |
| 21 | empty | forced | no | no | no | nothing |
| 22 | exists | empty | no | no | no | nothing |
| 23 | exists | exists | no | no | no | stop |
| 24 | exists | forced | no | no | no | stop |
Improved table, without repetitions:
|| # || stable || pending || in stable || in pending || in both || on restart ||
| 1 | empty | empty | no | no | no | nothing |
| 2 | empty | exists | no | no | no | nothing |
| 3 | empty | forced | no | no | no | nothing |
| 4 | exists | empty | yes | no | no | start |
| 5 | exists | exists | yes | no | no | start |
| 6 | exists | forced | yes | no | no | nothing |
| 7 | empty | exists | no | yes | no | start |
| 8 | empty | forced | no | yes | no | start |
| 9 | exists | empty | no | no | no | nothing |
| 10 | exists | exists | no | yes | no | start |
| 11 | exists | forced | no | yes | no | start |
| 12 | exists | exists | yes | yes | yes | start |
| 13 | exists | forced | yes | yes | yes | start |
| 14 | exists | exists | no | no | no | nothing |
| 15 | exists | forced | no | no | no | nothing |
We have an invariant: if a node is in stable but not in a forced pending,
raft must not be started on that node. The following example shows why:
1) stable = [A, B, C]
2) pending = [A, force = true]
3) Rebalance has happened, but the stable switch has not happened yet, and the
user has written some data to A
4) full restart
5) we cannot start raft nodes on B and C based on stable, because we would lose
the data written to A in step 3
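The scenario above can be replayed with a small sketch. The class name {{ForcedPendingExample}} and the method {{shouldStartRaft}} are hypothetical and are not taken from the Ignite code base; the predicate is only one possible reading of the invariant (start if the node is in pending, or if it is in stable while pending is not forced).

```java
import java.util.List;
import java.util.Set;

/** Illustration only: hypothetical names, not part of the Ignite code base. */
final class ForcedPendingExample {
    private ForcedPendingExample() {
    }

    /**
     * Assumed reading of the invariant: a node starts its raft group if it is in pending,
     * or if it is in stable and pending is not forced.
     */
    static boolean shouldStartRaft(Set<String> stable, Set<String> pending, boolean force, String node) {
        return pending.contains(node) || (stable.contains(node) && !force);
    }

    public static void main(String[] args) {
        Set<String> stable = Set.of("A", "B", "C"); // step 1
        Set<String> pending = Set.of("A");          // step 2, forced
        boolean force = true;

        // After the full restart (step 4) every node evaluates the predicate for itself.
        for (String node : List.of("A", "B", "C")) {
            String action = shouldStartRaft(stable, pending, force, node) ? "start" : "nothing";
            System.out.println(node + " -> " + action);
        }
        // Prints:
        // A -> start
        // B -> nothing
        // C -> nothing
    }
}
```

Under this reading only A restarts its raft group, so the data written to A in step 3 is not overridden by a group formed from B and C.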
[~jakutenshi] and I independently formulated conditions for a node to decide
whether it should start the raft group. The developer of this ticket is
responsible for choosing the more appropriate one:
{code:java}
// Condition 1: `force` is the force flag of the pending assignments.
if (
    (stable.contains(node) && (force && pending.contains(node))) ||
    (stable.contains(node) && (!force)) ||
    pending.contains(node)
) {
    // start the raft group on this node
}
{code}
{code:java}
stable.contains(node)
&& !(pending.contains(node) || pending.isForce())
|| pending.contains(node)
{code}
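Both conditions appear to reduce to the same predicate: the node is in pending, or it is in stable and pending is not forced. The sketch below checks this by enumerating all eight input combinations. It assumes that membership and the force flag can be modelled as plain booleans; the class and method names are hypothetical and are not taken from the Ignite code base.

```java
/** Illustration only: hypothetical names, not part of the Ignite code base. */
final class RaftStartConditions {
    private RaftStartConditions() {
    }

    /** First condition from the ticket, transcribed with boolean inputs. */
    static boolean first(boolean inStable, boolean inPending, boolean force) {
        return (inStable && (force && inPending))
                || (inStable && !force)
                || inPending;
    }

    /** Second condition from the ticket, transcribed with boolean inputs. */
    static boolean second(boolean inStable, boolean inPending, boolean force) {
        return inStable && !(inPending || force)
                || inPending;
    }

    /** Candidate simplification: in pending, or in stable while pending is not forced. */
    static boolean simplified(boolean inStable, boolean inPending, boolean force) {
        return inPending || (inStable && !force);
    }

    /** Returns true if all three predicates agree on every combination of inputs. */
    static boolean allAgree() {
        boolean[] values = {false, true};
        for (boolean inStable : values) {
            for (boolean inPending : values) {
                for (boolean force : values) {
                    boolean expected = simplified(inStable, inPending, force);
                    if (first(inStable, inPending, force) != expected
                            || second(inStable, inPending, force) != expected) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("conditions agree: " + allAgree()); // prints "conditions agree: true"
    }
}
```

The simplified form also matches the "on restart" column of the improved table, for example row 6 (in stable, forced pending without the node) gives "nothing" and row 13 (in both, forced) gives "start".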
h3. Implementation notes
The aforementioned condition must be integrated into
{{TableManager#startPartitionAndStartClient}} for the case {{boolean isRecovery ==
true}}.
The current code base may already contain all the needed logic; in that case
we only need to verify that this logic conforms to the table above.
h3. Definition of done
* The node correctly decides whether to start the raft group based on the
metastorage (MS) assignments keys
> Node restart behaviour for HA mode
> ----------------------------------
>
> Key: IGNITE-23780
> URL: https://issues.apache.org/jira/browse/IGNITE-23780
> Project: Ignite
> Issue Type: Improvement
> Reporter: Mirza Aliev
> Assignee: Kirill Sizov
> Priority: Major
> Labels: ignite-3
>