[
https://issues.apache.org/jira/browse/IGNITE-23692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Lapin updated IGNITE-23692:
-------------------------------------
Epic Link: IGNITE-23694
> Pendings might be lost during force reset
> -----------------------------------------
>
> Key: IGNITE-23692
> URL: https://issues.apache.org/jira/browse/IGNITE-23692
> Project: Ignite
> Issue Type: Bug
> Reporter: Mikhail Efremov
> Priority: Critical
> Labels: ignite-3, rebalance
>
> *Description*
> While working on IGNITE-22036 we found that a force reset can lose the
> pending assignments carried by the reset event. Moreover, this can already
> happen today, without the IGNITE-22036 changes. Scenario for the current state:
> {code}
> <unfinished rebalance> (UFR)
> │ { planned:null, pending:[A, B, C], stable: [A, B] }
> v
> A became a leader
> │ term == 1
> v
> A::onLeaderElected
> │ UFR pendings != null => do failover
> v
> A::(Raft)NodeImpl::changePeersAndLearnersAsync (CPALA)
> │ success
> v
> A::onNewPeersConfigurationApplied
> │
> v
> <A's process is frozen for some reason>
> │
> v
> <user does a force reset to [D, E, F]f>
> │ { planned:null, pending:[D, E, F]f, stable: [A, B] }
> v
> <A is unfrozen and somehow still a leader>
> │
> ├─> A::doStableKeySwitch
> │   │ MS::invoke -> success
> │   v
> │   { planned:null, pending:null, stable: [A, B, C] }
> │
> └─> A::TableManager::handlePendingsAssignmentsEvent (in parallel)
>     │ MS::get on pendings
>     v
>     Are [D, E, F]f lost? Presumably yes: if the revision of [D, E, F]f
>     is less than the current one (after the parallel stable switch),
>     then [D, E, F]f are treated as "stale" and lost.
> {code}
> With the IGNITE-22036 changes we could also get the following picture:
> {code}
> <unfinished rebalance> (UFR)
> │ { planned:null, pending:[A, B, C], stable: [A, B] }
> v
> B is elected as a leader
> │ term=1
> v
> A becomes a Primary Replica (PR)
> │ enlistment token == 1
> v
> A::Replica::onPrimaryElected
> │ set a callback to PR on leader elected and do failover
> │ with pendings == [A, B, C] != null
> v
> A::Replica::getCurrentPendingAssignments -> [A, B, C]
> │
> v
> <A's process is frozen for some reason right before CPALA>
> │
> v
> <user does a force reset to [D, E, F]f>
> │ { planned:null, pending:[D, E, F]f, stable: [A, B] }
> v
> B becomes PR
> │ enlistment token == 2
> v
> B::Replica::onPrimaryElected
> │ set a callback to PR on leader elected and do failover
> │ with pendings == [D, E, F]f != null
> v
> B::Replica::getCurrentPendingAssignments -> [D, E, F]f
> │
> v
> <B's process is frozen for some reason right before CPALA>
> │ <A resumes>
> v
> A::CPALA on [ABC] with term == 1
> │ success => term=2 { planned: null, pending: null, stable: [A, B, C] }
> v
> B resumes and fails because its term=1 != new term=2
> │
> v
> [D, E, F]f are lost.
> {code}
> As we can see, both cases allow the pendings of a force reset to be lost. As
> a solution we may tighten the Metastore invoke in {{doStableKeySwitch}} by
> adding to its condition that the current pendings must be equal to RAFT's
> current configuration. If the new condition is false, then the RAFT
> configuration we are trying to apply is stale, and we should wait until a
> newer configuration with the current pending assignments is applied through
> a failover.
> *Motivation*
> The case is reset-specific, and it might be a problem for a user if a reset
> appears to have no effect.
> *Definition of Done*
> # In the metastore invoke in {{doStableKeySwitch}}, add a new condition that
> the metastore's current pendings must be equal to RAFT's current
> configuration.
> # A corresponding test must be implemented in
> {{ItRebalanceDistributedTest}}.
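The guarded stable switch from the Definition of Done can be sketched as a compare-and-set over the assignment keys. This is only a minimal in-memory model under stated assumptions, not Ignite's metastore API: the class StableSwitchSketch, its Assignments holder, and the methods forceReset and tryStableSwitch are hypothetical names introduced for illustration, and promotion of planned assignments is left out.

```java
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical in-memory stand-in for the assignment keys; NOT Ignite's metastore API. */
final class StableSwitchSketch {
    /** Immutable snapshot of the planned/pending/stable keys (null means "key absent"). */
    static final class Assignments {
        final Set<String> planned;
        final Set<String> pending;
        final Set<String> stable;

        Assignments(Set<String> planned, Set<String> pending, Set<String> stable) {
            this.planned = planned;
            this.pending = pending;
            this.stable = stable;
        }
    }

    private final AtomicReference<Assignments> state;

    StableSwitchSketch(Set<String> planned, Set<String> pending, Set<String> stable) {
        state = new AtomicReference<>(new Assignments(planned, pending, stable));
    }

    Assignments current() {
        return state.get();
    }

    /** Models a force reset: the pending key is overwritten, stable is untouched. */
    void forceReset(Set<String> newPending) {
        state.updateAndGet(s -> new Assignments(s.planned, Set.copyOf(newPending), s.stable));
    }

    /**
     * Models the guarded invoke: stable is switched to the applied Raft peers only
     * if the pending key still equals them. Returns false when the applied
     * configuration is stale, leaving all keys unchanged.
     */
    boolean tryStableSwitch(Set<String> appliedPeers) {
        while (true) {
            Assignments seen = state.get();
            if (!Objects.equals(seen.pending, appliedPeers)) {
                return false; // pendings changed (e.g. by a force reset): do not clear them
            }
            Assignments next = new Assignments(seen.planned, null, Set.copyOf(appliedPeers));
            if (state.compareAndSet(seen, next)) {
                return true;
            }
            // Lost a race with a concurrent update: re-read and re-check the condition.
        }
    }
}
```

In this model, once the pendings have been reset to [D, E, F] while A's applied configuration is still [A, B, C], tryStableSwitch(Set.of("A", "B", "C")) returns false and the reset pendings stay in place until a configuration matching them is applied.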