[ 
https://issues.apache.org/jira/browse/IGNITE-20603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20603:
---------------------------------
    Epic Link: IGNITE-20611  (was: IGNITE-20166)

> Restore topologyAugmentationMap on a node restart
> -------------------------------------------------
>
>                 Key: IGNITE-20603
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20603
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mirza Aliev
>            Priority: Major
>              Labels: ignite-3
>
> h3. *Motivation*
> It is possible that some events were propagated to {{ms.logicalTopology}}, 
> but the node restarted while topologyAugmentationMap was being updated in 
> {{DistributionZoneManager#createMetastorageTopologyListener}}. In that case 
> the augmentation that should have been added to 
> {{zone.topologyAugmentationMap}} is missing, and this information must be 
> recovered.
> h3. *Definition of done*
> On a node restart, topologyAugmentationMap must be correctly restored 
> according to {{ms.logicalTopology}} state.
> h3. *Implementation notes*
> For every zone, compare {{MS.local.logicalTopology.revision}} with 
> max(maxScUpFromMap, maxScDownFromMap). If {{logicalTopology.revision}} is 
> greater, some topology changes were not propagated to 
> topologyAugmentationMap before the restart, and the corresponding timers 
> were not scheduled.
>
> To fill the gap in topologyAugmentationMap, compare 
> {{MS.local.logicalTopology}} with {{lastSeenTopology}} and add to 
> topologyAugmentationMap the nodes whose changes were not propagated before 
> the restart.
>
> {{lastSeenTopology}} is calculated as follows: read 
> {{MS.local.dataNodes}}, take max(scaleUpTriggerKey, scaleDownTriggerKey), 
> and retrieve all additions and removals of nodes from 
> topologyAugmentationMap using that value as the left bound. Apply these 
> changes to the map of node counters from {{MS.local.dataNodes}} and keep 
> only the nodes with positive counters. The result is {{lastSeenTopology}}.
>
> Comparing {{lastSeenTopology}} with {{MS.local.logicalTopology}} shows 
> which node additions and removals were not propagated to 
> topologyAugmentationMap before the restart. Add these differences to 
> topologyAugmentationMap, using {{MS.local.logicalTopology.revision}} as the 
> revision (the key in topologyAugmentationMap). It is safe to take this 
> revision, because if a node was added to {{ms.topology}} after an immediate 
> data nodes recalculation, that node must restore the intent of that 
> immediate recalculation.
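
The recovery steps in the implementation notes can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Ignite code: the class {{TopologyAugmentationRecovery}}, the {{Augmentation}} value type and all method names are hypothetical, nodes are represented as plain strings, the highest key in the map stands in for max(maxScUpFromMap, maxScDownFromMap), and the left bound is assumed to be exclusive.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

/** Illustrative sketch only; none of these types are the real Ignite classes. */
final class TopologyAugmentationRecovery {
    /** Hypothetical map value: nodes added (addition == true) or removed at one revision. */
    static final class Augmentation {
        final Set<String> nodes;
        final boolean addition;

        Augmentation(Set<String> nodes, boolean addition) {
            this.nodes = nodes;
            this.addition = addition;
        }
    }

    /** Step 1: is the logical topology revision newer than anything recorded in the map? */
    static boolean hasGap(long logicalTopologyRevision, NavigableMap<Long, Augmentation> map) {
        // Assumption: an empty map is treated as "possibly stale", so the comparison still runs.
        return map.isEmpty() || logicalTopologyRevision > map.lastKey();
    }

    /** Step 2: data node counters plus augmentations newer than triggerKey, positive counters only. */
    static Set<String> lastSeenTopology(
            Map<String, Integer> dataNodes, NavigableMap<Long, Augmentation> map, long triggerKey) {
        Map<String, Integer> counters = new HashMap<>(dataNodes);
        // Assumption: the left bound is exclusive (triggerKey is already reflected in dataNodes).
        for (Augmentation a : map.tailMap(triggerKey, false).values()) {
            int delta = a.addition ? 1 : -1;
            for (String node : a.nodes) {
                counters.merge(node, delta, Integer::sum);
            }
        }
        Set<String> result = new HashSet<>();
        counters.forEach((node, counter) -> {
            if (counter > 0) {
                result.add(node);
            }
        });
        return result;
    }

    /** Step 3a: nodes present in the logical topology but absent from lastSeenTopology. */
    static Set<String> missedAdditions(Set<String> logicalTopology, Set<String> lastSeen) {
        Set<String> added = new HashSet<>(logicalTopology);
        added.removeAll(lastSeen);
        return added;
    }

    /** Step 3b: nodes present in lastSeenTopology but gone from the logical topology. */
    static Set<String> missedRemovals(Set<String> logicalTopology, Set<String> lastSeen) {
        Set<String> removed = new HashSet<>(lastSeen);
        removed.removeAll(logicalTopology);
        return removed;
    }

    public static void main(String[] args) {
        Map<String, Integer> dataNodes = new HashMap<>();
        dataNodes.put("A", 1);
        dataNodes.put("B", 1);

        NavigableMap<Long, Augmentation> map = new TreeMap<>();
        map.put(7L, new Augmentation(new HashSet<>(Arrays.asList("C")), true));
        map.put(8L, new Augmentation(new HashSet<>(Arrays.asList("A")), false));

        Set<String> logicalTopology = new HashSet<>(Arrays.asList("B", "C", "D"));
        long logicalTopologyRevision = 10L;
        long triggerKey = 5L; // stands in for max(scaleUpTriggerKey, scaleDownTriggerKey)

        if (hasGap(logicalTopologyRevision, map)) {
            Set<String> lastSeen = lastSeenTopology(dataNodes, map, triggerKey);
            System.out.println("missed additions: " + missedAdditions(logicalTopology, lastSeen));
            System.out.println("missed removals: " + missedRemovals(logicalTopology, lastSeen));
        }
    }
}
```

The sketch stops at computing the differences. Recording them under {{MS.local.logicalTopology.revision}} depends on how a single map entry represents additions and removals that share one revision, which the description does not specify.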



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
