[ https://issues.apache.org/jira/browse/IGNITE-22801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Lapin updated IGNITE-22801:
-------------------------------------
    Description: 
h3. Motivation

The raft topology adjustment logic of both the CMG and the MG is broken because 
topology adjustments may be re-ordered.

From the node-local point of view, the following topology adjustment triggers
 # Add [B] as a learner -> resetLearners(B);
 # Add [B, C] as learners -> resetLearners(B, C);

may be reordered so that [B] is applied after [B, C]; as a result, node C won't 
be treated as a learner and will never receive its portion of data.
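
Below is a toy, self-contained illustration (plain Java, not Ignite code; the 
names are made up for the example) of why this re-ordering matters: 
resetLearners replaces the whole learner set, so applying an older snapshot 
after a newer one silently drops node C.

{code:java}
import java.util.Set;

public class LearnerReorderingDemo {
    // Learner set as seen by the raft group; resetLearners overwrites it wholesale.
    private static Set<String> learners = Set.of();

    private static void resetLearners(Set<String> newLearners) {
        learners = newLearners; // last write wins: nothing rejects a stale update
    }

    public static void main(String[] args) {
        resetLearners(Set.of("B", "C")); // trigger #2 applied first due to re-ordering
        resetLearners(Set.of("B"));      // trigger #1 applied last
        System.out.println(learners);    // prints [B]: node C is no longer a learner
    }
}
{code}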

It is worth mentioning that currently the node collocated with the CMG/MG 
leader manages the corresponding raft topology adjustment. That means that if 
node A believes it is the one collocated with the leader, it will send 
resetLearners, while in reality node B is the one collocated with the leader; 
thus, a distributed reordering is possible:
 # Add [B] as a learner -> resetLearners(B);
 # Remove [B] as a learner -> resetLearners();
 # Add [B] as a learner -> resetLearners(B);

Node A: resetLearners(B) -> is about to send resetLearners() ... hangs

Node D: resetLearners(B) -> resetLearners() -> resetLearners(B)

Node A then wakes up and sends resetLearners(), which is incorrect; moreover, 
node B will never be brought back as a learner, because A no longer believes 
that it is collocated with the leader.

Node-local reorderings will be covered in dedicated tickets for the CMG and the 
MG. Within the current one, the distributed reordering issue must be solved.
h3. Definition of Done
 * Configuration changes proposed by an old leader should be skipped. According 
to the current CMG/MG design, the new leader will catch up the process. A 
minimal sketch of such a check is given right after this list.
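
A minimal sketch of that rule, with assumed class and method names (this is not 
the actual JRaft/Ignite API): a configuration change proposal carries the leader 
term its proposer observed, and the proposal is dropped if that term is stale.

{code:java}
/** Toy guard that skips configuration changes proposed under an old leader term. */
final class ConfigurationChangeGuard {
    private long currentLeaderTerm;

    ConfigurationChangeGuard(long initialTerm) {
        this.currentLeaderTerm = initialTerm;
    }

    /** Invoked on leader election; the known leader term only moves forward. */
    synchronized void onLeaderElected(long newTerm) {
        currentLeaderTerm = Math.max(currentLeaderTerm, newTerm);
    }

    /**
     * @return {@code true} if the proposal was made under the current leader
     *         term, {@code false} if it comes from an old leader and must be
     *         skipped.
     */
    synchronized boolean accept(long proposedTerm) {
        return proposedTerm == currentLeaderTerm;
    }
}
{code}

With such a check in place, the stale resetLearners() sent by node A after it 
wakes up carries an outdated term and is skipped, and, per the current CMG/MG 
design, the new leader re-drives the topology adjustment.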

h3. Implementation Notes
 * Basically, it is required to add a term parameter to the changePeers method, 
like it is done for changePeersAsync. In case of a mismatching term, the 
configuration adjustment proposal should be skipped. A sketch of the proposed 
signature follows this list.
 * It is worth mentioning that currently the CMG uses resetLearners; however, 
we have agreed to use changePeers instead.



> Extend changePeers with term param in order to skip obsolete rebalance
> ----------------------------------------------------------------------
>
>                 Key: IGNITE-22801
>                 URL: https://issues.apache.org/jira/browse/IGNITE-22801
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Alexander Lapin
>            Assignee: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
