[
https://issues.apache.org/jira/browse/IGNITE-23566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896771#comment-17896771
]
Alexander Lapin commented on IGNITE-23566:
------------------------------------------
Sounds reasonable.
> Investigate possible races between resetPartitions and infinite rebalance
> retries
> ---------------------------------------------------------------------------------
>
> Key: IGNITE-23566
> URL: https://issues.apache.org/jira/browse/IGNITE-23566
> Project: Ignite
> Issue Type: Task
> Reporter: Kirill Gusakov
> Assignee: Kirill Gusakov
> Priority: Major
> Labels: ignite-3
>
> *Motivation*
> For now, our rebalance fail-over is a trivial infinite loop of retries:
> - on any issue during the catch-up phase or later, we call the
> onReconfigurationError listener
> - for now, this listener just counts the retries and calls the
> changePeersAndLearnersAsync logic again and again
> At the same time, we can call the resetPartitions logic and rewrite pending
> assignments, potentially at any moment. So we can have a race between
> rebalance retries and resetPartitions.
> *Definition of done*
> Under this ticket we need to investigate all possible issues, if any, and
> create follow-up issues to resolve them.
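
To make the race from the quoted Motivation concrete, here is a minimal sketch that plays one possible interleaving sequentially. It is a toy model, not Ignite code: the class, its fields, and the string-encoded assignments are all hypothetical, and only the names onReconfigurationError and resetPartitions are borrowed from the ticket. It also assumes that the retry re-issues the target it captured when the rebalance started, which is exactly the kind of behavior the investigation would need to confirm.

```java
/**
 * Hypothetical toy model (not Ignite code) of a retry listener racing
 * with a partition reset. Assignments are plain strings for brevity.
 */
final class RebalanceRaceSketch {
    /** Stand-in for the stored pending assignments. */
    private String pending;

    /** Stand-in for the configuration last requested from the replication group. */
    private String requested;

    /** Number of retries issued by the listener. */
    private int retries;

    /** Starts a rebalance: pending and the requested configuration agree. */
    void startRebalance(String target) {
        pending = target;
        requested = target;
    }

    /** Models resetPartitions: rewrites pending and requests the new configuration. */
    void resetPartitions(String forced) {
        pending = forced;
        requested = forced;
    }

    /**
     * Models the retry listener under the stated assumption: it counts the
     * retry and re-requests the target captured earlier, without re-reading
     * the pending assignments.
     */
    void onReconfigurationError(String capturedTarget) {
        retries++;
        requested = capturedTarget;
    }

    /** True when the last requested configuration matches the pending assignments. */
    boolean isConsistent() {
        return pending != null && pending.equals(requested);
    }

    int retries() {
        return retries;
    }

    String pending() {
        return pending;
    }

    String requested() {
        return requested;
    }

    public static void main(String[] args) {
        RebalanceRaceSketch group = new RebalanceRaceSketch();

        group.startRebalance("A,B,C");          // rebalance towards A,B,C begins
        group.resetPartitions("A");             // reset rewrites pending to A
        group.onReconfigurationError("A,B,C");  // late retry re-applies the stale target

        System.out.println("pending=" + group.pending()
                + " requested=" + group.requested()
                + " consistent=" + group.isConsistent());
    }
}
```

In this interleaving the late retry leaves the requested configuration out of step with the rewritten pending assignments. Whether the real listener can observe a stale target this way is one of the questions for the investigation.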