[ 
https://issues.apache.org/jira/browse/IGNITE-23823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Gusakov updated IGNITE-23823:
------------------------------------
    Description: 
*Motivation*
At the moment we have an issue with choosing the right node to be the target 
for reset.
* stable.assignments = [A(10),B(10),C(10),D(5),E(4)]. A(10) means that node A 
has lastLogId=10.
* Nodes A, B, C die.
* The HA reset timer expires.
* Node D(5) is chosen as the target for reset from the [D(5),E(4)] list.
* The reset process is initiated:
  * stable.assignments = [A,B,C,D,E], pending.assignments=[D(5)], 
planned.assignments=[D,E].
* Everything looks good, *but node E can actually have a potentially infinite 
queue of unprocessed messages in its local queue*, and as it processes them 
its index will increase, for example to 6' and then to 7': E(7').
* So, after the first rebalance succeeds we will have 
pending.assignments=[D(7),E(7')], stable.assignments=[D(6)]. The index of D 
increased by two because of the 2 rebalance reconfigurations (joint 
configuration + target configuration entries). 
At this point we have a 
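
The selection step in the scenario above can be sketched roughly as follows. This is only an illustration under assumptions: ResetTargetSketch, NodeState and chooseResetTarget are hypothetical names, not Ignite APIs, and the sketch assumes the target is simply the surviving node with the highest lastLogId observed when the HA reset timer fires. The second selection uses made-up later values to show why a snapshot of lastLogId can go stale once E drains its local queue.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch; none of these names are Ignite APIs. */
final class ResetTargetSketch {
    /** A surviving node together with the lastLogId observed at snapshot time. */
    record NodeState(String name, long lastLogId) {}

    /** Picks the node with the highest observed lastLogId, if any node survived. */
    static Optional<NodeState> chooseResetTarget(List<NodeState> aliveNodes) {
        return aliveNodes.stream().max(Comparator.comparingLong(NodeState::lastLogId));
    }

    public static void main(String[] args) {
        // Snapshot taken when the HA reset timer fires: A, B and C are already gone.
        List<NodeState> snapshot = List.of(new NodeState("D", 5), new NodeState("E", 4));
        System.out.println(chooseResetTarget(snapshot).orElseThrow().name()); // D

        // Assumed later state: E has processed its queued messages and reached
        // index 7 while D is still at 5, so the same rule now picks E instead.
        List<NodeState> later = List.of(new NodeState("D", 5), new NodeState("E", 7));
        System.out.println(chooseResetTarget(later).orElseThrow().name()); // E
    }
}
```

The point of the sketch is only that the choice depends on when lastLogId is sampled; it does not model the primed (divergent) entries 6' and 7' or the rebalance itself.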

> Clean planned nodes on
> ----------------------
>
>                 Key: IGNITE-23823
>                 URL: https://issues.apache.org/jira/browse/IGNITE-23823
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Kirill Gusakov
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
