[
https://issues.apache.org/jira/browse/IGNITE-22801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Lapin updated IGNITE-22801:
-------------------------------------
Description:
h3. Motivation
The raft topology adjustment logic of both CMG and MG is broken because
topology adjustments can be reordered.
From the node-local point of view, the following topology adjustment triggers
# Add [B] as Learner -> resetLearners(B);
# Add [B, C] as Learners -> resetLearners(B,C);
may be reordered so that [B] is applied after [B,C]. As a result, node C
won't be treated as a learner and will never receive its portion of data.
Note that the node collocated with the CMG/MG leader currently manages the
corresponding raft topology adjustment. That means that if node A believes it
is collocated with the leader, it will send resetLearners even though in
reality node B is the one collocated with the leader. This makes distributed
reordering possible:
# Add [B] as Learner -> resetLearners(B);
# Remove [B] as Learner -> resetLearners();
# Add [B] as Learner -> resetLearners(B);
Node A: resetLearners(B) -> is about to resetLearners() ... hangs
Node D: resetLearners(B) -> resetLearners() -> resetLearners(B)
Node A wakes up and sends resetLearners(), which is incorrect. Moreover, A will
never add node B back, because A no longer believes that it is collocated with
the leader.
Node-local reorderings will be covered in dedicated tickets for CMG and MG.
This ticket only needs to solve the distributed reordering issue.
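The interleaving above can be replayed with a small toy model. This is only a sketch: ToyGroup and its resetLearners are hypothetical stand-ins for illustration, not the Ignite raft API, and the only assumption is that each call overwrites the learner set with no term check.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Toy replay of the distributed reordering scenario (illustrative only). */
class StaleResetDemo {
    /** Hypothetical stand-in for the raft group: the last call wins, no term check. */
    static final class ToyGroup {
        private Set<String> learners = new TreeSet<>();

        /** Overwrites the whole learner set with the given nodes. */
        void resetLearners(String... newLearners) {
            learners = new TreeSet<>(Arrays.asList(newLearners));
        }

        Set<String> learners() {
            return learners;
        }
    }

    /** Replays the interleaving from the description and returns the final learner set. */
    static Set<String> replay() {
        ToyGroup group = new ToyGroup();

        group.resetLearners("B"); // Node A, step 1; A then hangs before step 2.
        group.resetLearners("B"); // Node D, step 1.
        group.resetLearners();    // Node D, step 2.
        group.resetLearners("B"); // Node D, step 3: the expected final state.
        group.resetLearners();    // Node A wakes up and sends its stale step 2.

        return group.learners();
    }

    public static void main(String[] args) {
        // B is expected to be a learner, but the stale call has removed it.
        System.out.println("Final learners: " + replay()); // prints "Final learners: []"
    }
}
```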
h3. Definition of Done
* Configuration changes proposed by an old leader should be skipped. According
to the current CMG/MG design, the new leader will take over the process.
h3. Implementation Notes
* A term parameter needs to be added to the changePeers method, as is already
done for changePeersAsync. If the term does not match, the configuration
adjustment proposal should be skipped.
* Note that CMG currently uses resetLearners; however, we've agreed to use
changePeers instead.
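A minimal sketch of the term check described in the notes above, again on a toy model. The ToyGroup class, the changePeers signature, the boolean return value and the electNewLeader helper are all assumptions made for illustration; the real changePeers signature in Ignite may differ.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of a term-guarded configuration change (illustrative only). */
class TermGuardedChangePeers {
    /** Hypothetical stand-in for the raft group; not the Ignite API. */
    static final class ToyGroup {
        private long currentTerm = 1;
        private Set<String> learners = new TreeSet<>();

        /** Simulates a leader election, which increments the term. */
        void electNewLeader() {
            currentTerm++;
        }

        long currentTerm() {
            return currentTerm;
        }

        /** Applies the proposal only if it was made in the current term; returns false if skipped. */
        boolean changePeers(long proposerTerm, String... newLearners) {
            if (proposerTerm != currentTerm) {
                return false; // Proposal comes from an old leader's term: skip it.
            }
            learners = new TreeSet<>(Arrays.asList(newLearners));
            return true;
        }

        Set<String> learners() {
            return learners;
        }
    }

    /** Replays the scenario from the Motivation section with the term check in place. */
    static Set<String> replay() {
        ToyGroup group = new ToyGroup();

        long termSeenByA = group.currentTerm();
        group.changePeers(termSeenByA, "B"); // Node A, step 1; A then hangs.

        group.electNewLeader();              // Leadership moves; node D is now collocated.
        long termSeenByD = group.currentTerm();
        group.changePeers(termSeenByD, "B"); // Node D, step 1.
        group.changePeers(termSeenByD);      // Node D, step 2.
        group.changePeers(termSeenByD, "B"); // Node D, step 3.

        group.changePeers(termSeenByA);      // Node A wakes up; stale term, skipped.

        return group.learners();
    }

    public static void main(String[] args) {
        System.out.println("Final learners: " + replay()); // prints "Final learners: [B]"
    }
}
```

In this sketch the skipped proposal is simply dropped, which matches the Definition of Done: the new leader is expected to drive the process from its own view of the topology.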
was:
h3. Motivation
Both CMG and MG raft topology adjustment logic is broken because of topology
adjustments re-ordering.
From the node local point of view following topology adjustment triggers
# Add [B] as Learner -> resetLearners(B);
# Add [B, C] as Learners -> resetLearners(B,C);
may be reordered in a way that [B] will be applied after [B,C], thus node C
won't be treated as learner and will never receive its portion of data.
Worth mentioning that currently a node collocated with CMG leader/MG leader
manages corresponding raft topology adjustment. That means that if node A
believes that it's a leader collocated one it will send resetLearners, while in
reality node B is the one that is collocated with a leader, thus it's possible
to have distributed based reordering:
# Add [B] as Learners -> resetLearners(B);
# Remove [B] as Learner -> resetLearners();
# Add [B] as Learner -> resetLearners(B);
Node A: resetLearners(B) -> is about to resetLearners() ... hangs
Node D: resetLearners(B) -> resetLearners() -> resetLearners(B)
Node A wakes up and sends resetLearners() which is incorrect, besides that it
will never return node B back because A no longer believes that it's collocated
with leader.
Node local reorderings will be covered in corresponding dedicated tickets for
CMG and MG. Within current one it's required to solve distributed reordering
issue.
h3. Definition of Done
* Configuration changes proposed by an old leader should be skipped. According
to the current CMG/MG design new leader will catch up the process.
h3. Implementation Notes
* Basically it's required to add term to changePeers method like it's done for
changePeersAsync
> Extend changePeers with term param in order to skip obsolete rebalance
> ----------------------------------------------------------------------
>
> Key: IGNITE-22801
> URL: https://issues.apache.org/jira/browse/IGNITE-22801
> Project: Ignite
> Issue Type: Bug
> Reporter: Alexander Lapin
> Assignee: Alexander Lapin
> Priority: Major
> Labels: ignite-3
--
This message was sent by Atlassian Jira
(v8.20.10#820010)