liyuheng55555 commented on code in PR #13559:
URL: https://github.com/apache/iotdb/pull/13559#discussion_r1774837824
##########
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/ProcedureManager.java:
##########
@@ -598,14 +607,109 @@ public boolean removeAINode(RemoveAINodePlan removeAINodePlan) {
return true;
}
- // region region migration
+ public TSStatus checkRemoveDataNodes(List<TDataNodeLocation> dataNodeLocations) {
+ // 1. Only one RemoveDataNodesProcedure is allowed in the cluster
+ Optional<Procedure<ConfigNodeProcedureEnv>> anotherRemoveProcedure =
+ this.executor.getProcedures().values().stream()
+ .filter(
+ procedure -> {
+ if (procedure instanceof RemoveDataNodesProcedure) {
+ return !procedure.isFinished();
+ }
+ return false;
+ })
+ .findAny();
+
+ String failMessage = null;
+ if (anotherRemoveProcedure.isPresent()) {
+ List<TDataNodeLocation> anotherRemoveDataNodes =
+ ((RemoveDataNodesProcedure) anotherRemoveProcedure.get()).getRemovedDataNodes();
+ failMessage =
+ String.format(
+ "Submit RemoveDataNodesProcedure failed, "
+ + "because another RemoveDataNodesProcedure %s is already in processing. "
+ + "IoTDB is able to have at most 1 RemoveDataNodesProcedure at the same time. "
+ + "For further information, please search [pid%d] in log. ",
+ anotherRemoveDataNodes, anotherRemoveProcedure.get().getProcId());
+ }
+
+ // 2. Check if the RemoveDataNodesProcedure conflicts with the RegionMigrateProcedure
+ RemoveDataNodeManager manager = env.getRemoveDataNodeManager();
+ Set<TConsensusGroupId> removedDataNodesRegionSet =
+ manager.getRemovedDataNodesRegionSet(dataNodeLocations);
+ Optional<Procedure<ConfigNodeProcedureEnv>> conflictRegionMigrateProcedure =
+ this.executor.getProcedures().values().stream()
+ .filter(
+ procedure -> {
+ if (procedure instanceof RegionMigrateProcedure) {
+ RegionMigrateProcedure regionMigrateProcedure =
+ (RegionMigrateProcedure) procedure;
+ if (regionMigrateProcedure.isFinished()) {
+ return false;
+ }
+ return removedDataNodesRegionSet.contains(
+ regionMigrateProcedure.getConsensusGroupId())
+ || dataNodeLocations.contains(regionMigrateProcedure.getDestDataNode());
+ }
+ return false;
+ })
+ .findAny();
+ if (conflictRegionMigrateProcedure.isPresent()) {
+ failMessage =
+ String.format(
+ "Submit RemoveDataNodesProcedure failed, "
+ + "because another RegionMigrateProcedure %s is already in processing which conflicts with this RemoveDataNodesProcedure. "
+ + "The RegionMigrateProcedure is migrating the region %s to the DataNode %s. "
+ + "For further information, please search [pid%d] in log. ",
+ conflictRegionMigrateProcedure.get().getProcId(),
+ ((RegionMigrateProcedure) conflictRegionMigrateProcedure.get()).getConsensusGroupId(),
+ ((RegionMigrateProcedure) conflictRegionMigrateProcedure.get()).getDestDataNode(),
+ conflictRegionMigrateProcedure.get().getProcId());
+ }
+ // 3. Check if the RegionMigrateProcedure generated by RemoveDataNodesProcedure conflicts with
+ // each other
Review Comment:
This becomes a problem when removing multiple DataNodes at the same time.
When the number of regions is relatively large and they are distributed
evenly enough, any two DataNodes are likely to host replicas of at least one
common region, so this check will almost always report a conflict.
I think submitting the RegionMigrateProcedures in several rounds would solve
this, but it doesn't have to be done in v1.
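
To make the "several rounds" idea concrete, here is a minimal sketch and not a proposal for the actual implementation. The `MigrationRounds` and `Migration` names are hypothetical, plain `int` ids stand in for `TConsensusGroupId` and the DataNode id, and the sketch assumes the only constraint is that a region must not be migrated twice within the same round.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of IoTDB: groups planned region migrations
// into rounds so that no round contains two migrations of the same region.
class MigrationRounds {

  // Hypothetical stand-in for one migration a RemoveDataNodesProcedure would generate.
  static final class Migration {
    final int regionId; // stands in for TConsensusGroupId
    final int sourceNodeId; // the DataNode being removed

    Migration(int regionId, int sourceNodeId) {
      this.regionId = regionId;
      this.sourceNodeId = sourceNodeId;
    }
  }

  // Greedy packing: each migration goes into the earliest round
  // that does not already touch its region.
  static List<List<Migration>> planRounds(List<Migration> migrations) {
    List<List<Migration>> rounds = new ArrayList<>();
    List<Set<Integer>> regionsPerRound = new ArrayList<>();
    for (Migration m : migrations) {
      int i = 0;
      while (i < rounds.size() && regionsPerRound.get(i).contains(m.regionId)) {
        i++;
      }
      if (i == rounds.size()) {
        // Every existing round already migrates this region: open a new round.
        rounds.add(new ArrayList<>());
        regionsPerRound.add(new HashSet<>());
      }
      rounds.get(i).add(m);
      regionsPerRound.get(i).add(m.regionId);
    }
    return rounds;
  }
}
```

With this grouping, each round could be submitted only after the previous one finishes, so the migrations generated for different removed DataNodes would never overlap on a region at the same time.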