Mirza Aliev created IGNITE-20603:
------------------------------------

             Summary: Restore topologyAugmentationMap
                 Key: IGNITE-20603
                 URL: https://issues.apache.org/jira/browse/IGNITE-20603
             Project: Ignite
          Issue Type: Bug
            Reporter: Mirza Aliev

h3. *Motivation*

Some events may have been propagated to {{ms.logicalTopology}} while a restart happened in the middle of updating topologyAugmentationMap in {{DistributionZoneManager#createMetastorageTopologyListener}}. In that case the augmentation that should have been added to {{zone.topologyAugmentationMap}} is missing, and this information must be recovered.

h3. *Definition of done*

h3. *Implementation notes*

For every zone, compare {{MS.local.logicalTopology.revision}} with max(maxScUpFromMap, maxScDownFromMap). If {{logicalTopology.revision}} is greater than max(maxScUpFromMap, maxScDownFromMap), some topology changes were not propagated to topologyAugmentationMap before the restart, and the corresponding timers were not scheduled.

To fill the gap in topologyAugmentationMap, compare {{MS.local.logicalTopology}} with {{lastSeenLogicalTopology}} and add to topologyAugmentationMap the nodes that were not propagated to it before the restart.

{{lastSeenTopology}} is calculated as follows: read {{MS.local.dataNodes}}, take max(scaleUpTriggerKey, scaleDownTriggerKey), and retrieve all additions and removals of nodes from topologyAugmentationMap using max(scaleUpTriggerKey, scaleDownTriggerKey) as the left bound. Then apply these changes to the map of node counters from {{MS.local.dataNodes}} and keep only the nodes with positive counters. The result is {{lastSeenTopology}}.

Comparing {{lastSeenTopology}} with {{MS.local.logicalTopology}} shows which node additions and removals were not propagated to topologyAugmentationMap before the restart. Add these differences to topologyAugmentationMap, using {{MS.local.logicalTopology.revision}} as the revision (the key in topologyAugmentationMap).
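
The recovery steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual {{DistributionZoneManager}} code: the {{AugmentationRecovery}} class, the {{Augmentation}} value type, node names as plain strings, and all method names are hypothetical. It also assumes that the left bound is exclusive and that max(maxScUpFromMap, maxScDownFromMap) is the largest key currently in the map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeSet;

final class AugmentationRecovery {
    /** Hypothetical map value: nodes added and removed at one topology revision. */
    static final class Augmentation {
        final Set<String> added;
        final Set<String> removed;

        Augmentation(Set<String> added, Set<String> removed) {
            this.added = added;
            this.removed = removed;
        }
    }

    /** Rebuilds lastSeenTopology from the data node counters and later augmentations. */
    static Set<String> lastSeenTopology(
            Map<String, Integer> dataNodeCounters,
            NavigableMap<Long, Augmentation> augmentationMap,
            long leftBound) {
        Map<String, Integer> counters = new HashMap<>(dataNodeCounters);

        // Assumption: entries at leftBound itself are already reflected in dataNodes.
        for (Augmentation a : augmentationMap.tailMap(leftBound, false).values()) {
            a.added.forEach(node -> counters.merge(node, 1, Integer::sum));
            a.removed.forEach(node -> counters.merge(node, -1, Integer::sum));
        }

        // Keep only the nodes whose counter is positive.
        Set<String> topology = new TreeSet<>();
        counters.forEach((node, count) -> {
            if (count > 0) {
                topology.add(node);
            }
        });
        return topology;
    }

    /** Adds the missing augmentation for one zone, if the map lags behind the topology. */
    static void restore(
            long topologyRevision,
            Set<String> logicalTopology,
            Map<String, Integer> dataNodeCounters,
            NavigableMap<Long, Augmentation> augmentationMap,
            long scaleUpTriggerRevision,
            long scaleDownTriggerRevision) {
        // Assumption: an empty map is treated as revision 0.
        long maxRevisionInMap = augmentationMap.isEmpty() ? 0L : augmentationMap.lastKey();
        if (topologyRevision <= maxRevisionInMap) {
            return; // Nothing was lost before the restart.
        }

        long leftBound = Math.max(scaleUpTriggerRevision, scaleDownTriggerRevision);
        Set<String> lastSeen = lastSeenTopology(dataNodeCounters, augmentationMap, leftBound);

        // Nodes present now but not seen before were added; the reverse were removed.
        Set<String> added = new TreeSet<>(logicalTopology);
        added.removeAll(lastSeen);
        Set<String> removed = new TreeSet<>(lastSeen);
        removed.removeAll(logicalTopology);

        if (!added.isEmpty() || !removed.isEmpty()) {
            augmentationMap.put(topologyRevision, new Augmentation(added, removed));
        }
    }
}
```

For example, with counters A=1 and B=1, trigger revisions 10 and 8, a map holding an addition of A at revision 10 and a removal of B at revision 11, and a logical topology of A and C at revision 12, the sketch rebuilds a last seen topology of A alone and records the addition of C. The new entry is keyed by the logical topology revision, 12.
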
It is safe to take this revision: if a node was added to {{ms.topology}} after an immediate data nodes recalculation, the added node must restore the intent of that immediate recalculation.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)