symious commented on code in PR #3751:
URL: https://github.com/apache/ozone/pull/3751#discussion_r979748292
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -529,19 +529,10 @@ private void checkIterationMoveResults() {
allFuturesResult.get(config.getMoveTimeout().toMillis(),
TimeUnit.MILLISECONDS);
} catch (InterruptedException e) {
+ LOG.warn("Container balancer is interrupted");
Thread.currentThread().interrupt();
} catch (TimeoutException e) {
- long timeoutCounts = moveSelectionToFutureMap.entrySet().stream()
- .filter(entry -> !entry.getValue().isDone())
- .peek(entry -> {
- LOG.warn("Container move canceled for container {} from source {}" +
- " to target {} due to timeout.",
- entry.getKey().getContainerID(),
- containerToSourceMap.get(entry.getKey().getContainerID())
- .getUuidString(),
- entry.getKey().getTargetNode().getUuidString());
- entry.getValue().cancel(true);
Review Comment:
I remember adding this cancelation because of an issue in our cluster: even
after the timeout, the datanodes still handled the move requests, so many
containers ended up with 4 replicas and ReplicationManager had to clean up the
redundant ones.
The cancelation here helps eliminate these cases.
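
To make the pattern concrete, here is a minimal standalone sketch of "wait with a timeout, then cancel whatever has not finished". It is not the ContainerBalancer code: the class `MoveTimeoutSketch`, the method `cancelUnfinishedMoves`, and the generic key type are hypothetical names used only for illustration, and the sketch assumes the pending moves are tracked as `CompletableFuture<Boolean>` values in a map.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only; not part of Ozone.
final class MoveTimeoutSketch {

  private MoveTimeoutSketch() {
  }

  /**
   * Waits up to timeoutMillis for all moves to complete. On timeout, cancels
   * every move that is still pending and returns how many were cancelled.
   */
  static <K> long cancelUnfinishedMoves(
      Map<K, CompletableFuture<Boolean>> moves, long timeoutMillis) {
    CompletableFuture<Void> all = CompletableFuture.allOf(
        moves.values().toArray(new CompletableFuture[0]));
    try {
      all.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return 0; // every move finished in time
    } catch (InterruptedException e) {
      // Restore the interrupt flag; this sketch leaves pending moves alone.
      Thread.currentThread().interrupt();
      return 0;
    } catch (ExecutionException e) {
      // allOf only completes once every future is done, so even though at
      // least one move failed, nothing is left to cancel.
      return 0;
    } catch (TimeoutException e) {
      long cancelled = 0;
      for (Map.Entry<K, CompletableFuture<Boolean>> entry : moves.entrySet()) {
        CompletableFuture<Boolean> future = entry.getValue();
        // cancel() returns true only if this call moved the future into
        // the cancelled state.
        if (!future.isDone() && future.cancel(true)) {
          cancelled++;
        }
      }
      return cancelled;
    }
  }
}
```

One caveat: for `CompletableFuture`, the `mayInterruptIfRunning` argument of `cancel` has no effect, so cancelling only marks the future as cancelled. Whether that also stops a datanode from acting on a command it already received depends on what the caller does in response to the cancelled future, which this sketch does not model.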
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]