[
https://issues.apache.org/jira/browse/IGNITE-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mirza Aliev updated IGNITE-16011:
---------------------------------
Description:
When partition assignments are updated, we need to start raft changePeers and
handle failover scenarios.
When a metastore event about a partition assignments update is received, we need to:
- Start all needed nodes
{code:java}
partition.assignments.pending / partition.assignments.stable{code}
- After the nodes have started successfully, check whether the current node is
the leader of the raft group (the leader response must be updated with the
current term) and, if so, call changePeers(leaderTerm, peers). changePeers calls
from old terms must be skipped.
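The term check above could look roughly like the following sketch. {{ChangePeersGuard}} and its members are hypothetical names, not part of Ignite or its raft module; the sketch only assumes that leader terms grow monotonically, so a request carrying a term lower than the latest one seen can be dropped.

```java
import java.util.List;

/** Hypothetical guard that skips changePeers requests issued under an outdated leader term. */
class ChangePeersGuard {
    /** Highest leader term seen so far (assumption: terms never decrease). */
    private long lastSeenTerm = -1;

    /** Peers from the most recently accepted request, kept only for illustration. */
    private List<String> lastRequestedPeers = List.of();

    /**
     * Accepts the request if it does not come from an old term.
     *
     * @return true if the request was accepted, false if it was skipped as stale.
     */
    synchronized boolean changePeers(long leaderTerm, List<String> peers) {
        if (leaderTerm < lastSeenTerm) {
            return false; // Request from an old term: skip it.
        }

        lastSeenTerm = leaderTerm;
        lastRequestedPeers = List.copyOf(peers);

        // A real implementation would forward the request to the raft group here.
        return true;
    }

    synchronized List<String> lastRequestedPeers() {
        return lastRequestedPeers;
    }
}
```

A request with term 2 followed by one with term 1 leaves the peers of the first request in place, because the second one is treated as stale.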
We also need to propagate some new events from the raft side:
* {{onLeaderElected(boolean configurationChangeInProgress)}} - must be
executed by the new leader when the raft group changes its leader. We may
also need to check whether a new lease has been received; this needs investigation.
* {{onChangePeersError(errorContext)}} - must be executed when any error
occurs during changePeers
* {{onChangePeersCommitted(peers)}} - must be executed with the list of new
peers when changePeers has completed successfully.
and handle them appropriately:
* {{onLeaderElected(configurationChangeInProgress)}} - we need to:
** if {{configurationChangeInProgress}} == false and pending/planned
assignments are not empty - run a new changePeers. If true, do nothing.
* {{onChangePeersError(errorContext)}} - run failover logic
* {{onChangePeersCommitted(peers)}} - check if planned assignments are not
empty and move them to pending.
** Update pending and stable partition assignments:
{code:java}
metastoreInvoke: // atomic
    // Here we can check the invariant that pending is empty, but planned is not
    partition.assignments.stable = appliedPeers
    if empty(partition.assignments.planned):
        partition.assignments.pending = empty
    else:
        partition.assignments.pending = partition.assignments.planned
{code}
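As an illustration, the leader-election check and the atomic update could be modelled as below. {{PartitionAssignments}} is a hypothetical in-memory stand-in for the three metastore keys, and {{synchronized}} stands in for the atomic metastore invoke. Two points are assumptions rather than something the description states: that "pending/planned not empty" means at least one of the two is non-empty, and that moving planned to pending also clears planned.

```java
import java.util.Set;

/** Hypothetical in-memory stand-in for the stable/pending/planned keys of one partition. */
class PartitionAssignments {
    Set<String> stable = Set.of();
    Set<String> pending = Set.of();
    Set<String> planned = Set.of();

    /**
     * Decision taken in onLeaderElected: start a new changePeers only if no
     * configuration change is in progress and there is something to apply.
     * Assumption: "pending/planned not empty" means either set is non-empty.
     */
    synchronized boolean shouldRunChangePeers(boolean configurationChangeInProgress) {
        boolean nothingToApply = pending.isEmpty() && planned.isEmpty();

        return !configurationChangeInProgress && !nothingToApply;
    }

    /**
     * Mirrors the metastoreInvoke pseudocode for onChangePeersCommitted.
     * The synchronized method plays the role of the atomic invoke.
     */
    synchronized void onChangePeersCommitted(Set<String> appliedPeers) {
        stable = Set.copyOf(appliedPeers);

        // Covers both branches: an empty planned set simply leaves pending empty.
        pending = planned;

        // Assumption: "move" means planned is cleared once it becomes pending.
        planned = Set.of();
    }
}
```

With pending {{n1, n2}} and planned {{n2, n3}}, committing {{n1, n2}} makes it stable and promotes {{n2, n3}} to pending; committing {{n2, n3}} afterwards leaves pending empty, so a newly elected leader has nothing to start.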
When {{partition.assignments.stable}} is updated, we need to:
* Replace the current raft client with a new one that has the appropriate peers
* Stop the unneeded raft node
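One way to decide which raft nodes are no longer needed is a set difference between the previous and the new stable assignments. The sketch below is only an assumption about how this could be done; {{StableAssignmentsDiff}} and its methods are hypothetical names, and peers are represented as plain node-name strings.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper that compares old and new stable assignments. */
class StableAssignmentsDiff {
    /** Returns the peers that were in the old stable set but are absent from the new one. */
    static Set<String> peersToStop(Set<String> oldStable, Set<String> newStable) {
        Set<String> toStop = new HashSet<>(oldStable);
        toStop.removeAll(newStable);

        return toStop;
    }

    /** True if the local node left the stable set and should stop its raft node. */
    static boolean shouldStopLocalNode(String localNode, Set<String> oldStable, Set<String> newStable) {
        return peersToStop(oldStable, newStable).contains(localNode);
    }
}
```

For old stable {{n1, n2, n3}} and new stable {{n2, n3, n4}}, only {{n1}} is reported, so node {{n1}} would stop its raft node while every node recreates its raft client with the new peers.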
(Phase 1)
was:
When partition assignments are updated, we need to start raft changePeers and
handle failover scenarios.
When a metastore event about a partition assignments update is received, we need to:
- Start all needed nodes
{code:java}
partition.assignments.pending / partition.assignments.stable{code}
- After the nodes have started successfully, check whether the current node is
the leader of the raft group (the leader response must be updated with the
current term) and, if so, call changePeers(leaderTerm, peers). changePeers calls
from old terms must be skipped.
We also need to propagate some new events from the raft side:
* {{onLeaderElected(boolean configurationChangeInProgress)}} - must be
executed by the new leader when the raft group changes its leader. We may
also need to check whether a new lease has been received; this needs investigation.
* {{onChangePeersError(errorContext)}} - must be executed when any error
occurs during changePeers
* {{onChangePeersCommitted(peers)}} - must be executed with the list of new
peers when changePeers has completed successfully.
and handle them appropriately:
* {{onLeaderElected(configurationChangeInProgress)}} - we need to:
** if {{configurationChangeInProgress}} == false and pending/planned
assignments are not empty - run a new changePeers. If true, do nothing.
* {{onChangePeersError(errorContext)}} - run failover logic
* {{onChangePeersCommitted(peers)}} - check if planned assignments are not
empty and move them to pending.
** Update pending and stable partition assignments:
{code:java}
metastoreInvoke: // atomic
    partition.assignments.stable = appliedPeers
    if empty(partition.assignments.planned):
        partition.assignments.pending = empty
    else:
        partition.assignments.pending = partition.assignments.planned
{code}
When {{partition.assignments.stable}} is updated, we need to:
* Replace the current raft client with a new one that has the appropriate peers
* Stop the unneeded raft node
(Phase 1)
> Start new rebalance round, when partition assignments updated
> -------------------------------------------------------------
>
> Key: IGNITE-16011
> URL: https://issues.apache.org/jira/browse/IGNITE-16011
> Project: Ignite
> Issue Type: Task
> Reporter: Kirill Gusakov
> Priority: Major
> Labels: ignite-3
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)