siddhantsangwan commented on code in PR #6305:
URL: https://github.com/apache/ozone/pull/6305#discussion_r1522691055
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerTask.java:
##########
@@ -856,19 +860,34 @@ private boolean moveContainer(DatanodeDetails source,
" {} failed: {}",
moveSelection.getContainerID(), source.getUuidString(),
moveSelection.getTargetNode().getUuidString(), result);
+ // add source back if move failed due to container related errors
+ if (result ==
MoveManager.MoveResult.REPLICATION_FAIL_NOT_EXIST_IN_SOURCE ||
+ result ==
MoveManager.MoveResult.REPLICATION_FAIL_EXIST_IN_TARGET ||
+ result ==
MoveManager.MoveResult.REPLICATION_FAIL_CONTAINER_NOT_CLOSED ||
+ result ==
MoveManager.MoveResult.REPLICATION_FAIL_INFLIGHT_DELETION ||
+ result ==
MoveManager.MoveResult.REPLICATION_FAIL_INFLIGHT_REPLICATION) {
+ findSourceStrategy.addBackSourceDataNode(source);
+ }
Review Comment:
We need to handle the case where the scheduled move completes with a
`MoveResult` other than `COMPLETED`, before the if...else code of `if
(future.isDone())` gets executed. In this case, this line `return result ==
MoveManager.MoveResult.COMPLETED;` will evaluate to false and the source
datanode will not get added back. However, instead of handling this case in the
whenComplete block, I suggest handling it this way:
```
if (future.isDone()) {
if (future.isCompletedExceptionally()) {
return false;
} else {
MoveManager.MoveResult result = future.join();
moveSelectionToFutureMap.put(moveSelection, future);
if (result ==
MoveManager.MoveResult.REPLICATION_FAIL_NOT_EXIST_IN_SOURCE ||
result ==
MoveManager.MoveResult.REPLICATION_FAIL_EXIST_IN_TARGET ||
result ==
MoveManager.MoveResult.REPLICATION_FAIL_CONTAINER_NOT_CLOSED ||
result ==
MoveManager.MoveResult.REPLICATION_FAIL_INFLIGHT_DELETION ||
result ==
MoveManager.MoveResult.REPLICATION_FAIL_INFLIGHT_REPLICATION) {
findSourceStrategy.addBackSourceDataNode(source);
}
return result == MoveManager.MoveResult.COMPLETED;
}
} else {
moveSelectionToFutureMap.put(moveSelection, future);
return true;
}
```
This will achieve the same result and I think is a better way mainly because
of thread safety issues. Any code in the `whenComplete` block can be called
from a thread other than the balancing thread. This means we need to care about
thread safety problems, such as concurrent accesses to the priority queue that
`findSourceStrategy.addBackSourceDataNode(source)` is changing. We're better
off avoiding this to prevent multithreading issues.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]