[
https://issues.apache.org/jira/browse/IGNITE-23823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kirill Gusakov updated IGNITE-23823:
------------------------------------
Description:
*Motivation*
At the moment, we have an issue with choosing the right node as the target for
the reset.
* stable.assignments = [A(10),B(10),C(10),D(5),E(4)], where A(10) means that
node A has lastLogId=10.
* Nodes A, B and C die.
* The HA reset timer expires.
* Node D(5) is chosen as the reset target from the [D(5),E(4)] list.
* The reset process is initiated:
* stable.assignments = [A,B,C,D,E], pending.assignments=[D(5)],
planned.assignments=[D,E].
* Everything looks good, *but node E may still have a potentially unbounded
backlog of unprocessed messages in its local queue*. As it processes them, its
index increases, for example to 6' and then to 7': E(7').
* So, after the first rebalance succeeds, we will have
pending.assignments=[D(7),E(7')], stable.assignments=[D(6)]. The index of D
increased by two because of the 2 rebalance reconfigurations (joint
configuration + target configuration entries).
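The scenario above can be sketched in a few lines of Java. This is only an
illustration under stated assumptions, not the actual Ignite code: the Peer
record, the chooseResetTarget method and the "highest lastLogId wins" rule are
hypothetical names and a simplified reading of the description.

```java
import java.util.Comparator;
import java.util.List;

public class ResetTargetSketch {
    /** Hypothetical snapshot of a surviving node: its name and the lastLogId it reported. */
    record Peer(String name, long lastLogId) {}

    /** Picks the node with the highest reported lastLogId (assumed selection rule). */
    static Peer chooseResetTarget(List<Peer> alive) {
        return alive.stream()
                .max(Comparator.comparingLong(Peer::lastLogId))
                .orElseThrow();
    }

    public static void main(String[] args) {
        // Snapshot taken when the HA reset timer fires: [D(5), E(4)].
        List<Peer> snapshot = List.of(new Peer("D", 5), new Peer("E", 4));
        Peer target = chooseResetTarget(snapshot);
        System.out.println("target = " + target.name());

        // Later, E drains its local queue and its index moves past the snapshot value.
        // In the description this is 7', i.e. entries that D never saw.
        Peer eLater = new Peer("E", 7);

        // D's index after the first rebalance, per the description: 5 + 2 = 7
        // (joint configuration entry + target configuration entry).
        long dAfterRebalance = target.lastLogId() + 2;

        // The snapshot said E was behind D; that is no longer guaranteed.
        System.out.println("E still behind D: " + (eLater.lastLogId() < dAfterRebalance));
    }
}
```

The sketch shows only that the choice is made from a point-in-time snapshot.
Comparing plain numbers cannot tell 7 from 7', which is why the late entries on
E are a problem for the planned assignments.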
At this point we have a
> Clean planned nodes on
> ----------------------
>
> Key: IGNITE-23823
> URL: https://issues.apache.org/jira/browse/IGNITE-23823
> Project: Ignite
> Issue Type: Improvement
> Reporter: Kirill Gusakov
> Priority: Major
>