[jira] [Updated] (IGNITE-23560) resetPartitions improvements: add ability to retrieve DistributionZoneManager#logicalTopology based on revision.

Mirza Aliev (Jira) Mon, 28 Oct 2024 18:02:05 -0700


     [ 
https://issues.apache.org/jira/browse/IGNITE-23560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mirza Aliev updated IGNITE-23560:
---------------------------------
    Description: 
h3. Motivation

According to 
[IEP-131|https://cwiki.apache.org/confluence/display/IGNITE/IEP-131%3A+Partition+Majority+Unavailability+Handling]
 {{DisasterRecoveryManager#resetPartitions}} will be the core method for 
recovering partition majority availability. At the moment this method uses 
{{DistributionZoneManager#logicalTopology}} in 
{{ManualGroupUpdateRequest#handle}} to retrieve alive nodes. The problem is 
that {{DisasterRecoveryManager#resetPartitions}} is linearised by meta storage 
revision, but {{DistributionZoneManager#logicalTopology}} does not take the 
revision into account when returning the topology. As a result, the subsequent 
alive-node checks that are based on this revision could end up working with a 
more up-to-date state from {{DistributionZoneManager#logicalTopology}}.

We need to add a way to work with {{DistributionZoneManager#logicalTopology}} 
based on the revision.

h3. Implementation notes

The revision-based implementation of 
{{DistributionZoneManager#logicalTopology}} must take the metastorage 
compaction mechanism into account: we cannot simply read the logical topology 
from the meta storage at a given revision, because that data could already be 
compacted.
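
One possible shape for this, as a rough sketch only: keep topology snapshots in memory keyed by the meta storage revision at which they became visible, answer reads with the latest snapshot at or below the requested revision, and trim the history when compaction advances. The class {{LogicalTopologyHistory}} and its methods {{onTopologyUpdate}} and {{onCompaction}} are hypothetical names introduced for illustration; they are not existing Ignite APIs, and the actual solution may look different.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical sketch, not Ignite code: logical topology snapshots keyed by meta storage revision. */
final class LogicalTopologyHistory {
    /** Revision at which the topology changed -> names of nodes alive from that revision on. */
    private final ConcurrentSkipListMap<Long, Set<String>> snapshots = new ConcurrentSkipListMap<>();

    /** Records the topology that became visible at the given revision. */
    void onTopologyUpdate(long revision, Set<String> nodeNames) {
        snapshots.put(revision, Set.copyOf(nodeNames));
    }

    /**
     * Returns the topology as of the given revision, i.e. the snapshot with the
     * greatest recorded revision that is less than or equal to the requested one.
     * Throws if no such snapshot is retained any more.
     */
    Set<String> logicalTopology(long revision) {
        Map.Entry<Long, Set<String>> entry = snapshots.floorEntry(revision);
        if (entry == null) {
            throw new IllegalStateException("No topology snapshot retained for revision " + revision);
        }
        return entry.getValue();
    }

    /**
     * Assumed hook: invoked when the meta storage is compacted up to the given revision.
     * The latest snapshot at or below that revision is kept, because it still describes
     * the topology for later revisions; strictly older snapshots are dropped.
     */
    void onCompaction(long compactionRevision) {
        Long keep = snapshots.floorKey(compactionRevision);
        if (keep != null) {
            snapshots.headMap(keep, false).clear();
        }
    }
}
```

With snapshots recorded at revisions 3 and 7, a request for revision 5 would be served from the snapshot at revision 3, even though a newer topology already exists. The cost of this design is that the history lives outside the meta storage and has to be rebuilt or persisted separately on restart, which the sketch does not address.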

h3. Definition of done
* {{DistributionZoneManager#logicalTopology}} must be extended with the 
revision, so that {{DisasterRecoveryManager#resetPartitions}} can work correctly

  was:
h3. Motivation

According to 
[IEP-131|https://cwiki.apache.org/confluence/display/IGNITE/IEP-131%3A+Partition+Majority+Unavailability+Handling]
 {{DisasterRecoveryManager#resetPartitions}} will be the core method for 
recovering partition majority availability. At the moment this method uses 
{{DistributionZoneManager#logicalTopology}} in 
{{ManualGroupUpdateRequest#handle}} to retrieve alive nodes. The problem is 
that {{DisasterRecoveryManager#resetPartitions}} is linearised by meta storage 
revision, but {{DistributionZoneManager#logicalTopology}} does not take the 
revision into account when returning the topology. As a result, the subsequent 
alive-node checks that are based on this revision, for example 
{{DistributionZoneManager#dataNodes}}, could end up working with a more 
up-to-date state from {{DistributionZoneManager#logicalTopology}}.

We need to add a way to work with {{DistributionZoneManager#logicalTopology}} 
based on the revision.

h3. Implementation notes

The revision-based implementation of 
{{DistributionZoneManager#logicalTopology}} must take the metastorage 
compaction mechanism into account: we cannot simply read the logical topology 
from the meta storage at a given revision, because that data could already be 
compacted.

h3. Definition of done
* {{DistributionZoneManager#logicalTopology}} must be extended with the 
revision, so that {{DisasterRecoveryManager#resetPartitions}} can work correctly


> resetPartitions improvements: add ability to retrieve 
> DistributionZoneManager#logicalTopology based on revision.
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-23560
>                 URL: https://issues.apache.org/jira/browse/IGNITE-23560
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Mirza Aliev
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
