[
https://issues.apache.org/jira/browse/IGNITE-24069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mikhail Pochatkin reassigned IGNITE-24069:
------------------------------------------
Assignee: Vadim Kolodin
> Turn the pending assignments into a queue
> -----------------------------------------
>
> Key: IGNITE-24069
> URL: https://issues.apache.org/jira/browse/IGNITE-24069
> Project: Ignite
> Issue Type: Improvement
> Reporter: Denis Chudov
> Assignee: Vadim Kolodin
> Priority: Major
> Labels: ignite-3
>
> *Motivation*
> In Raft, a configuration switch requires joint consensus, in which the nodes
> of the old and new configurations are included with their corresponding
> roles. So, a node that is a learner in the previous configuration cannot
> simply be included as a follower in the new one. The rule of joint consensus
> requires that this node first be removed as a learner and only then be
> included in the next configuration as a peer, so there will be two
> configuration switches. Downgrading (from peer to learner) should look the
> same.
> The handlers of the pending and stable assignments’ switch should be aware of
> changes in which some node (let’s say, node A) is turned from a learner into
> a peer or, vice versa, from a peer into a learner. There should be two
> consecutive configuration switches for either upgrade or downgrade. For an
> upgrade, the first switch removes node A as a learner and the second one adds
> it as a peer; a downgrade mirrors this.
> The values under the meta storage pending assignments prefix
> "assignments.pending." should be turned into a queue of pending assignments.
> The queue is created for a replication group by the rebalance trigger or
> during the switch of planned assignments to pending, when it is detected that
> a direct transition from stable assignments to pending is not possible. It
> stores a sequence of assignments, each of which describes some intermediate
> state of the Raft configuration; only the last element of the queue holds the
> target assignments.
> It is important that the whole queue is logically a single rebalance,
> scheduled by a single trigger. It can be modified only in the process of
> rebalancing. The meaning of stable and planned assignments does not change,
> and the stable assignments’ switch happens only after the whole pending
> assignments queue has been processed. So, no replicas should be stopped until
> that moment (only Raft configurations may change), because replicas are
> stopped and storages are deleted only by the stable assignments’ change
> listener.
> *Definition of done*
> Pending assignments are turned into a queue without any change in the logic.
> This is a prerequisite for further changes.
> The pending assignments’ change handler should process the first element of
> the pending assignments queue (PAQ), performing changePeersAndLearnersAsync()
> using the assignments from it.
> Listeners of leader re-election and primary replica change should also be
> adjusted.
> *Implementation notes*
> There are 2 different kinds of pending assignments: for tables and for zones
> (until data colocation is implemented and the responsibility for partitions
> is fully transferred to zones): RebalanceUtil#PENDING_ASSIGNMENTS_PREFIX and
> ZoneRebalanceUtil#PENDING_ASSIGNMENTS_PREFIX. This ticket covers both.
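
The queue described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Ignite code: the types Assignments and RaftClient, the class PendingQueueSketch and the methods buildQueue and applyHead are all hypothetical, and the parameter list of changePeersAndLearnersAsync() is assumed. The ticket does not say what the intermediate configuration should be; the sketch takes it to be the stable configuration with the role-changing nodes removed, and it does not check that this leaves enough peers for a quorum.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of a pending assignments queue; none of these types are Ignite APIs. */
public class PendingQueueSketch {
    /** One Raft configuration: voting peers and non-voting learners (node names). */
    public record Assignments(Set<String> peers, Set<String> learners) {}

    /** Stand-in for the Raft client; the real method signature is an assumption. */
    public interface RaftClient {
        CompletableFuture<Void> changePeersAndLearnersAsync(Assignments assignments);
    }

    /** Nodes that switch between the learner and peer roles from stable to target. */
    public static Set<String> roleChanged(Assignments stable, Assignments target) {
        Set<String> changed = new HashSet<>();
        for (String node : stable.learners()) {
            if (target.peers().contains(node)) {
                changed.add(node); // upgrade: learner -> peer
            }
        }
        for (String node : stable.peers()) {
            if (target.learners().contains(node)) {
                changed.add(node); // downgrade: peer -> learner
            }
        }
        return changed;
    }

    /** Builds the queue: an optional intermediate step, then the target as the last element. */
    public static Deque<Assignments> buildQueue(Assignments stable, Assignments target) {
        Deque<Assignments> queue = new ArrayDeque<>();
        Set<String> changed = roleChanged(stable, target);
        if (!changed.isEmpty()) {
            // Assumed intermediate state: stable configuration minus the role-changing nodes.
            Set<String> peers = new HashSet<>(stable.peers());
            Set<String> learners = new HashSet<>(stable.learners());
            peers.removeAll(changed);
            learners.removeAll(changed);
            queue.addLast(new Assignments(peers, learners));
        }
        queue.addLast(target);
        return queue;
    }

    /** Applies only the first element of the queue; the queue itself is not modified here. */
    public static CompletableFuture<Void> applyHead(Deque<Assignments> queue, RaftClient client) {
        Assignments head = queue.peekFirst();
        if (head == null) {
            return CompletableFuture.<Void>completedFuture(null);
        }
        return client.changePeersAndLearnersAsync(head);
    }
}
```

With stable peers {B, C} and learner {A}, and a target where A becomes a peer, buildQueue yields two elements: peers {B, C} with no learners, then the target. When no node changes its role, the queue holds the target alone, which matches the direct transition from stable to pending.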