[
https://issues.apache.org/jira/browse/IGNITE-23692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mikhail Efremov updated IGNITE-23692:
-------------------------------------
Description:
*Description*
While working on IGNITE-22036, we found that a force reset can lose the
pending assignments passed in the reset event. Moreover, this can happen even
now, without the IGNITE-22036 changes. Scenario for the current state:
{code}
<unfinished rebalance> (UFR)
│ { planned:null, pending:[A, B, C], stable: [A, B] }
v
A became a leader
│ term == 1
v
A::onLeaderElected
│ UFR pendings != null => do failover
v
A::(Raft)NodeImpl::changeAndLearnersPeersAsync (CPALA)
│ success
v
A::onNewPeersConfigurationApplied
│
v
<A's process is frozen for some reason>
│
v
<user does a force reset to [D, E, F]f>
│ { planned:null, pending:[D, E, F]f, stable: [A, B] }
v
<A is unfrozen and somehow still a leader>─────────────────────────────┐
│                                                                      │
v                                                                      v
A::doStableKeySwitch                              A::TableManager::handlePendingsAssignmentsEvent
│ MS::invoke -> success                                                │
v                                                                      v
{ planned:null, pending:null, stable: [A, B, C] } ┌─> So, can we lose [D, E, F] from this event?
│                                                 │   Guess so, because we do MS::get on pendings,
v                                                 │   and if [D, E, F]f's revision is less than the
[D, E, F]f are lost? ─────────────────────────────┘   current one (after the parallel stable switch),
                                                      then [D, E, F]f are "stale" and therefore lost.
{code}
With the IGNITE-22036 changes, the following scenario is also possible:
{code}
<unfinished rebalance> (UFR)
│ { planned:null, pending:[A, B, C], stable: [A, B] }
v
B is elected as a leader
│ term=1
v
A becomes the Primary Replica (PR)
│ enlistment token == 1
v
A::Replica::onPrimaryElected
│ set a callback to PR on leader elected and do failover with pendings == [A, B, C] != null
v
A::Replica::getCurrentPendingAssignments -> [A, B, C]
│
v
<A's process is frozen for some reason right before CPALA>
│
v
<user does a force reset to [D, E, F]f>
│ { planned:null, pending:[D, E, F]f, stable: [A, B] }
v
B becomes PR
│ enlistment token == 2
v
B::Replica::onPrimaryElected
│ set a callback to PR on leader elected and do failover with pendings == [D, E, F]f != null
v
B::Replica::getCurrentPendingAssignments -> [D, E, F]f
│
v
<B's process is frozen for some reason right before CPALA>
│ <A wakes up>
v
A::CPALA on [A, B, C] with term == 1
│ success => term=2 { planned: null, pending: null, stable: [A, B, C] }
v
B wakes up and fails because term=1 != new term=2
│
v
[D, E, F]f are lost.
{code}
As we can see, both scenarios can lose the pendings set by the force reset. As
a solution, we can strengthen the Metastore invoke in {{doStableKeySwitch}} by
adding a condition that the current pendings must be equal to RAFT's current
configuration. If the new condition is false, the RAFT configuration we just
applied is stale, and we should wait until a newer configuration with the
current pending assignments is applied through a failover.
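The proposed condition can be illustrated with a minimal sketch. It is a toy in-memory model, not Ignite's metastore API: the {{Assignments}} class, {{unguardedSwitch}} and {{guardedSwitch}} are hypothetical names introduced only for illustration, and the unguarded variant models just the outcome shown in the scenarios above (the invoke succeeds and the pendings are cleared), not the real invoke conditions.

```java
import java.util.Objects;
import java.util.Set;

/** Toy model of the stable switch; all names here are hypothetical, not Ignite's API. */
public class StableSwitchSketch {
    /** In-memory stand-in for the stable and pending assignment keys. */
    static final class Assignments {
        Set<String> stable;
        Set<String> pending; // null means "no pending assignments"

        Assignments(Set<String> stable, Set<String> pending) {
            this.stable = stable;
            this.pending = pending;
        }
    }

    /** Models the outcome in the scenarios above: the switch clears whatever pendings exist. */
    static boolean unguardedSwitch(Assignments ms, Set<String> appliedRaftConfig) {
        ms.stable = appliedRaftConfig;
        ms.pending = null;
        return true;
    }

    /** Proposed behaviour: switch only if the current pendings equal the applied RAFT configuration. */
    static boolean guardedSwitch(Assignments ms, Set<String> appliedRaftConfig) {
        if (!Objects.equals(ms.pending, appliedRaftConfig)) {
            // The applied configuration is stale: keep the pendings and wait for a failover.
            return false;
        }
        ms.stable = appliedRaftConfig;
        ms.pending = null;
        return true;
    }

    public static void main(String[] args) {
        Set<String> abc = Set.of("A", "B", "C");
        Set<String> def = Set.of("D", "E", "F");

        // State after the force reset, while node A has already applied [A, B, C].
        Assignments lossy = new Assignments(Set.of("A", "B"), def);
        unguardedSwitch(lossy, abc);
        System.out.println("pendings lost: " + (lossy.pending == null)); // true

        Assignments ms = new Assignments(Set.of("A", "B"), def);
        System.out.println("stale switch applied: " + guardedSwitch(ms, abc)); // false
        System.out.println("reset pendings kept: " + def.equals(ms.pending)); // true

        // Later a failover applies [D, E, F] and the switch goes through.
        System.out.println("fresh switch applied: " + guardedSwitch(ms, def)); // true
        System.out.println("stable is [D, E, F]: " + def.equals(ms.stable)); // true
    }
}
```

In a real metastore the comparison and the update would have to be evaluated atomically as part of the invoke condition; the sketch is single-threaded and only shows the decision logic.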
*Motivation*
The case is reset-specific, and it is a problem for a user who performs a
reset and then sees that it had no effect.
*Definition of Done*
# In the metastore invoke in {{doStableKeySwitch}}, add a new condition: the
metastore's current pendings must be equal to the current RAFT
configuration.
# The corresponding test must be implemented in {{ItRebalanceDistributedTest}}.
> Pendings might be lost during force reset
> -----------------------------------------
>
> Key: IGNITE-23692
> URL: https://issues.apache.org/jira/browse/IGNITE-23692
> Project: Ignite
> Issue Type: Bug
> Reporter: Mikhail Efremov
> Priority: Critical
> Labels: ignite-3, rebalance
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)