[
https://issues.apache.org/jira/browse/IGNITE-23560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mirza Aliev updated IGNITE-23560:
---------------------------------
Description:
h3. Motivation
According to
[IEP-131|https://cwiki.apache.org/confluence/display/IGNITE/IEP-131%3A+Partition+Majority+Unavailability+Handling]
{{DisasterRecoveryManager#resetPartitions}} will be the core method for
recovering partition majority availability. At the moment this method uses
{{DistributionZoneManager#logicalTopology}} in
{{ManualGroupUpdateRequest#handle}} to retrieve alive nodes. The problem is
that {{DisasterRecoveryManager#resetPartitions}} is linearised by the meta
storage revision, but {{DistributionZoneManager#logicalTopology}} does not take
the revision into account. As a result, the subsequent alive-nodes checks,
which are based on this revision, could end up working with a more up-to-date
state from {{DistributionZoneManager#logicalTopology}}.
We need to add a way to work with {{DistributionZoneManager#logicalTopology}}
based on the revision.
h3. Implementation notes
The implementation of a revision-based
{{DistributionZoneManager#logicalTopology}} must take the metastorage
compaction mechanism into account: we cannot simply read the logical topology
from the meta storage at a given revision, because this data could already be
compacted.
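One possible direction, shown only as a minimal sketch and not as the actual Ignite design: keep topology snapshots in memory keyed by the revision at which they became visible, resolve a read at revision R to the newest snapshot at or below R, and trim the history when compaction advances. The class {{LogicalTopologyHistory}} and its methods {{onTopologyUpdate}}, {{logicalTopology(long)}} and {{onCompaction}} are hypothetical names introduced for illustration, and nodes are represented by plain strings.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical in-memory history of logical topology snapshots keyed by meta storage revision. */
final class LogicalTopologyHistory {
    /** Revision at which a topology became visible -> node names in that topology. */
    private final NavigableMap<Long, Set<String>> snapshots = new TreeMap<>();

    /** Records the topology that became visible at the given revision. */
    synchronized void onTopologyUpdate(long revision, Set<String> nodeNames) {
        snapshots.put(revision, Set.copyOf(nodeNames));
    }

    /** Returns the topology as of the given revision: the newest snapshot with revision <= requested. */
    synchronized Set<String> logicalTopology(long revision) {
        Map.Entry<Long, Set<String>> entry = snapshots.floorEntry(revision);
        if (entry == null) {
            // Either nothing was recorded yet or the needed snapshot was trimmed by compaction.
            throw new IllegalStateException("No topology snapshot at or below revision " + revision);
        }
        return entry.getValue();
    }

    /**
     * Trims history after compaction. The newest snapshot at or below the compaction
     * revision is kept, so reads at later revisions still resolve; older ones are dropped.
     */
    synchronized void onCompaction(long compactionRevision) {
        Long keep = snapshots.floorKey(compactionRevision);
        if (keep != null) {
            snapshots.headMap(keep, false).clear();
        }
    }
}
```

In this sketch a read at a revision older than every retained snapshot fails explicitly instead of silently returning newer state, which is the mismatch described in the motivation. How the real implementation should react to such a read is an open design question that this example does not answer.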
h3. Definition of done
* {{DistributionZoneManager#logicalTopology}} must be extended to take a
revision into account, so that {{DisasterRecoveryManager#resetPartitions}} can
work correctly
was:
h3. Motivation
According to
[IEP-131|https://cwiki.apache.org/confluence/display/IGNITE/IEP-131%3A+Partition+Majority+Unavailability+Handling]
{{DisasterRecoveryManager#resetPartitions}} will be the core method for
recovering partition majority availability. At the moment this method uses
{{DistributionZoneManager#logicalTopology}} in
{{ManualGroupUpdateRequest#handle}} to retrieve alive nodes. The problem is
that {{DisasterRecoveryManager#resetPartitions}} is linearised by the meta
storage revision, but {{DistributionZoneManager#logicalTopology}} does not take
the revision into account. As a result, the subsequent alive-nodes checks that
are based on this revision, for example {{DistributionZoneManager#dataNodes}},
could end up working with a more up-to-date state from
{{DistributionZoneManager#logicalTopology}}.
We need to add a way to work with {{DistributionZoneManager#logicalTopology}}
based on the revision.
h3. Implementation notes
The implementation of a revision-based
{{DistributionZoneManager#logicalTopology}} must take the metastorage
compaction mechanism into account: we cannot simply read the logical topology
from the meta storage at a given revision, because this data could already be
compacted.
h3. Definition of done
* {{DistributionZoneManager#logicalTopology}} must be extended to take a
revision into account, so that {{DisasterRecoveryManager#resetPartitions}} can
work correctly
> resetPartitions improvements: add ability to retrieve
> DistributionZoneManager#logicalTopology based on revision.
> ----------------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-23560
> URL: https://issues.apache.org/jira/browse/IGNITE-23560
> Project: Ignite
> Issue Type: Improvement
> Reporter: Mirza Aliev
> Priority: Major
> Labels: ignite-3