[ https://issues.apache.org/jira/browse/IGNITE-22801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Lapin updated IGNITE-22801:
-------------------------------------
    Issue Type: Bug  (was: Improvement)

> Extend changePeers with term param in order to skip obsolete rebalance
> ----------------------------------------------------------------------
>
>                 Key: IGNITE-22801
>                 URL: https://issues.apache.org/jira/browse/IGNITE-22801
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Alexander Lapin
>            Assignee: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3
>
> h3. Motivation
> Both the CMG and MG raft topology adjustment logic is broken because topology
> adjustments can be reordered.
> From a node-local point of view, the following topology adjustment triggers
> # Add [B] as Learner -> resetLearners(B);
> # Add [B, C] as Learners -> resetLearners(B,C);
> may be reordered so that [B] is applied after [B,C]. Node C then won't be
> treated as a learner and will never receive its portion of data.
> It is worth mentioning that currently the node collocated with the CMG
> leader/MG leader manages the corresponding raft topology adjustment. That
> means that if node A believes that it is the leader-collocated one, it will
> send resetLearners, while in reality node B is the one collocated with the
> leader. Thus distributed reordering is possible:
> # Add [B] as Learner -> resetLearners(B);
> # Remove [B] as Learner -> resetLearners();
> # Add [B] as Learner -> resetLearners(B);
> Node A: resetLearners(B) -> is about to resetLearners() ... hangs
> Node D: resetLearners(B) -> resetLearners() -> resetLearners(B)
> Node A wakes up and sends resetLearners(), which is incorrect. Besides that,
> it will never return node B back, because A no longer believes that it is
> collocated with the leader.
> Node-local reorderings will be covered in corresponding dedicated tickets for
> CMG and MG. Within the current one, it is required to solve the distributed
> reordering issue.
> h3. Definition of Done
> * Configuration changes proposed by an old leader should be skipped.
> According to the current CMG/MG design, the new leader will pick up the process.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
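
The distributed reordering scenario and the proposed term check can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class TermGuardedLearners, the onNewTerm method, and the resetLearners(term, learners) signature are hypothetical and are not the Ignite or raft API. It only models the idea that a request carrying a term older than the group's current term is skipped.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of a raft group that skips learner changes carrying a stale term. */
public class TermGuardedLearners {
    private long currentTerm;
    private Set<String> learners = new TreeSet<>();

    public TermGuardedLearners(long initialTerm) {
        this.currentTerm = initialTerm;
    }

    /** Simulates a leader election: the term only moves forward. */
    public synchronized void onNewTerm(long newTerm) {
        if (newTerm > currentTerm) {
            currentTerm = newTerm;
        }
    }

    /**
     * Applies the change only if the caller observed the current term.
     * Returns false when the request is skipped as obsolete.
     */
    public synchronized boolean resetLearners(long observedTerm, Set<String> newLearners) {
        if (observedTerm != currentTerm) {
            return false;
        }
        learners = new TreeSet<>(newLearners);
        return true;
    }

    /** Returns a copy of the current learner set. */
    public synchronized Set<String> learners() {
        return new TreeSet<>(learners);
    }

    public static void main(String[] args) {
        TermGuardedLearners group = new TermGuardedLearners(1);

        // Node A believes it is collocated with the leader of term 1.
        group.resetLearners(1, Set.of("B"));

        // Leadership moves; node D is now collocated with the leader of term 2.
        group.onNewTerm(2);
        group.resetLearners(2, Set.of("B"));
        group.resetLearners(2, Set.of());
        group.resetLearners(2, Set.of("B"));

        // Node A wakes up and sends its delayed resetLearners() with the old term.
        boolean applied = group.resetLearners(1, Set.of());

        System.out.println("stale request applied: " + applied);
        System.out.println("learners: " + group.learners());
    }
}
```

In this model the delayed request from node A is skipped, so B stays a learner after the sequence issued by node D. Node-local reordering within a single term is not addressed by such a check, which matches the scope stated in the issue.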