[jira] [Commented] (IGNITE-17056) Design rebalance cancel mechanism

Denis Chudov (Jira) Wed, 22 Feb 2023 05:38:04 -0800


    [ 
https://issues.apache.org/jira/browse/IGNITE-17056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692197#comment-17692197
 ]


Denis Chudov commented on IGNITE-17056:
---------------------------------------

[~kgusakov] lgtm.

> Design rebalance cancel mechanism
> ---------------------------------
>
>                 Key: IGNITE-17056
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17056
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Mirza Aliev
>            Assignee: Kirill Gusakov
>            Priority: Major
>              Labels: ignite-3
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> There are cases when the current leader cannot perform a rebalance on the 
> specified set of nodes, for example, when some node from the raft group 
> permanently fails with RaftError#ECATCHUP. A retry mechanism for such 
> scenarios was implemented in IGNITE-16801, but we cannot retry a rebalance 
> intent indefinitely, so a mechanism for canceling a rebalance should be 
> implemented. 
> Naive canceling could be implemented by removing the pending key and 
> replacing it with the planned key. But this approach has several crucial 
> limitations and may cause inconsistency in the current rebalance protocol, 
> for example, when there is a race between the cancel and the new leader 
> applying a new assignment to the stable key. The cancel may remove the 
> pending key right before the new assignment is applied to the stable key, 
> and then peers cannot be resolved to ClusterIds, because the resolution is 
> done on the union of the pending and stable keys.
> Also there is a case when we can lose a planned rebalance:
>  # The current leader retries a failed rebalance.
>  # The current leader stops being the leader for some reason and sleeps.
>  # The new leader performs the rebalance and calls 
> RebalanceRaftGroupEventsListener#onNewPeersConfigurationApplied.
>  # At this moment the old leader wakes up and cancels the current rebalance, 
> so it removes the pending key and writes the planned key to it.
>  # At this moment we receive 
> RebalanceRaftGroupEventsListener#onNewPeersConfigurationApplied from the new 
> leader and see that the planned key is empty, so we just delete the pending 
> key. Deleting this key is not correct, because the rebalance associated with 
> the removed key hasn't been performed yet.
> Also we should consider separating the scenarios for recoverable and 
> unrecoverable errors, because it might be useless to retry a rebalance if 
> some participating node fails with an unrecoverable error. 
> It seems we should think carefully about introducing failure handling for 
> such exceptional scenarios.
> The new node role from https://issues.apache.org/jira/browse/IGNITE-17252, 
> the primary replica, can help us resolve this issue in a simpler way by 
> canceling the rebalance from the primary replica.
>  
> As a result of this issue, we must design a correct algorithm for canceling 
> a hung rebalance.
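
The lost planned rebalance described in the numbered steps above can be replayed with a small model. This is only a sketch under stated assumptions: the three keys are modeled as entries of an in-memory map, the class, the method bodies, and the node names A to E are hypothetical, and nothing here is the actual Ignite implementation or its meta storage API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical single-threaded model of the stable/pending/planned keys; not Ignite code.
class LostPlannedRebalanceSketch {
    static final String STABLE = "stable";
    static final String PENDING = "pending";
    static final String PLANNED = "planned";

    // Naive cancel as described in the issue: drop pending and move planned into its place.
    static void naiveCancel(Map<String, Set<String>> keys) {
        Set<String> planned = keys.remove(PLANNED);
        if (planned == null) {
            keys.remove(PENDING);
        } else {
            keys.put(PENDING, planned);
        }
    }

    // Assumed shape of the handler: promote the applied peers to stable, then
    // promote planned to pending, or delete pending when planned is empty.
    static void onNewPeersConfigurationApplied(Map<String, Set<String>> keys, Set<String> applied) {
        keys.put(STABLE, applied);
        Set<String> planned = keys.remove(PLANNED);
        if (planned == null) {
            keys.remove(PENDING);
        } else {
            keys.put(PENDING, planned);
        }
    }

    // Replays steps 3 to 5 of the scenario and returns the final key state.
    static Map<String, Set<String>> simulate() {
        Map<String, Set<String>> keys = new HashMap<>();
        keys.put(STABLE, Set.of("A", "B", "C"));
        keys.put(PENDING, Set.of("A", "B", "D"));
        keys.put(PLANNED, Set.of("A", "B", "E"));

        // Step 3: the new leader finishes the rebalance to the pending peers;
        // its notification is delivered only in step 5.
        Set<String> applied = keys.get(PENDING);

        // Step 4: the old leader wakes up and cancels, so pending now holds {A, B, E}.
        naiveCancel(keys);

        // Step 5: the late notification sees an empty planned key and deletes pending.
        onNewPeersConfigurationApplied(keys, applied);
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> keys = simulate();
        System.out.println("pending present: " + keys.containsKey(PENDING));
        System.out.println("planned {A, B, E} survives: " + keys.containsValue(Set.of("A", "B", "E")));
    }
}
```

In this model both checks report false: the stable key ends up with the new leader's peers, while the assignment that was planned has been dropped without ever being performed.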



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
