[
https://issues.apache.org/jira/browse/IGNITE-19288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mirza Aliev updated IGNITE-19288:
---------------------------------
Description:
h3. Motivation
If the new logical topology contains both newly joined nodes and nodes that have
left the cluster, then DistributionZoneManager#scheduleTimers() schedules both
saveDataNodesOnScaleUp and saveDataNodesOnScaleDown. These tasks are invoked
asynchronously but use the same entry in topologyAugmentationMap: scale up puts
an entry under some revision, and then scale down puts an entry under the same
revision as the key, so one entry replaces the other.
The issue is reproduced by
DistributionZoneAwaitDataNodesTest#testSeveralScaleUpAndSeveralScaleDownThenScaleUpAndScaleDown
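A minimal sketch of the collision, not the actual Ignite code: it assumes the map is keyed by revision (as described above), and it replaces {{NodeWithAttributes}} with plain node names. The class {{AugmentationOverwriteDemo}} and the method {{putBoth}} are hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListMap;

public class AugmentationOverwriteDemo {
    /** Simplified stand-in for Augmentation (assumption: nodes are plain names here). */
    static final class Augmentation {
        final Set<String> nodes;
        final boolean addition;

        Augmentation(Set<String> nodes, boolean addition) {
            this.nodes = nodes;
            this.addition = addition;
        }
    }

    /** Puts a scale-up and a scale-down entry under one revision and returns what is left. */
    static Augmentation putBoth(long revision) {
        Map<Long, Augmentation> topologyAugmentationMap = new ConcurrentSkipListMap<>();

        // Scale up records the added node under the revision.
        topologyAugmentationMap.put(revision, new Augmentation(Set.of("C"), true));
        // Scale down uses the same key, so the previous entry is replaced.
        topologyAugmentationMap.put(revision, new Augmentation(Set.of("A"), false));

        return topologyAugmentationMap.get(revision);
    }

    public static void main(String[] args) {
        Augmentation survivor = putBoth(5L);
        System.out.println("addition=" + survivor.addition + ", nodes=" + survivor.nodes);
    }
}
```

In this sketch the puts run sequentially, so the scale-down entry always survives. With asynchronous tasks either entry may be the one that is lost.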
h3. Definition of Done
* Concurrency bug is fixed.
* Test is enabled.
UPD:
In general, the problem can be reproduced only in a very rare case, namely when
we receive {{LogicalTopologyEventListener#onTopologyLeap}} and the new topology
has both added and removed nodes compared with the topology from the
metastorage.
The solution is to change the representation of
{{DistributionZoneManager.ZoneState#topologyAugmentationMap}}.
Currently we have
{code:java}
private static class Augmentation {
    /** Nodes affected by this augmentation. */
    Set<NodeWithAttributes> nodes;

    /** Flag that indicates whether {@code nodes} should be added or removed. */
    boolean addition;

    Augmentation(Set<NodeWithAttributes> nodes, boolean addition) {
        this.nodes = nodes;
        this.addition = addition;
    }
}
{code}
I suggest storing the addition flag in {{NodeWithAttributes}}, so that a single
revision entry in {{DistributionZoneManager.ZoneState#topologyAugmentationMap}}
can hold both added and removed nodes.
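One possible shape of that change, as a sketch under stated assumptions rather than the final implementation: {{Node}} is a hypothetical simplified stand-in for {{NodeWithAttributes}} carrying the per-node flag, and {{record}} and {{names}} are illustrative helpers that do not exist in the code base.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentSkipListMap;

public class PerNodeAugmentationSketch {
    /** Hypothetical simplified node: the addition flag lives on the node itself. */
    static final class Node {
        final String name;
        final boolean addition;

        Node(String name, boolean addition) {
            this.name = name;
            this.addition = addition;
        }

        @Override public boolean equals(Object o) {
            return o instanceof Node && ((Node) o).name.equals(name) && ((Node) o).addition == addition;
        }

        @Override public int hashCode() {
            return Objects.hash(name, addition);
        }
    }

    /** Revision -> nodes added or removed at that revision. */
    private final Map<Long, Set<Node>> topologyAugmentationMap = new ConcurrentSkipListMap<>();

    /** Merges nodes into the entry for the revision instead of replacing the entry. */
    void record(long revision, Set<String> names, boolean addition) {
        // The remapping function may be retried, so it builds a fresh set and has no side effects.
        topologyAugmentationMap.compute(revision, (rev, existing) -> {
            Set<Node> merged = existing == null ? new HashSet<>() : new HashSet<>(existing);
            for (String name : names) {
                merged.add(new Node(name, addition));
            }
            return merged;
        });
    }

    /** Returns the names recorded for the revision with the given flag. */
    Set<String> names(long revision, boolean addition) {
        Set<String> result = new TreeSet<>();
        for (Node node : topologyAugmentationMap.getOrDefault(revision, Set.of())) {
            if (node.addition == addition) {
                result.add(node.name);
            }
        }
        return result;
    }
}
```

With this layout, a scale-up and a scale-down recorded under the same revision both remain visible, regardless of which task writes first.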
was:
h3. Motivation
If new logical topology has a new nodes and nodes that left cluster then
DistributionZoneManager#scheduleTimers() schedules saveDataNodesOnScaleUp and
saveDataNodesOnScaleDown. These tasks are invoked asynchronously but use the
same entry in topologyAugmentationMap. So scale up puts entry with some
revision and then scale down puts entry with the same revision as key.
The issue is reproduced by
DistributionZoneAwaitDataNodesTest#testSeveralScaleUpAndSeveralScaleDownThenScaleUpAndScaleDown
h3. Definition of Done
* Concurrency bug is fixed.
* Test is enabled.
> A race on scheduling data nodes updates if there are new nodes and stopped nodes
> in logical topology
> ------------------------------------------------------------------------------------------------
>
> Key: IGNITE-19288
> URL: https://issues.apache.org/jira/browse/IGNITE-19288
> Project: Ignite
> Issue Type: Bug
> Reporter: Sergey Uttsel
> Assignee: Mirza Aliev
> Priority: Major
> Labels: ignite-3