[GitHub] [ozone] symious commented on a diff in pull request #3751: HDDS-6492. Add metric for failed container moves

GitBox Mon, 26 Sep 2022 02:01:21 -0700


symious commented on code in PR #3751:
URL: https://github.com/apache/ozone/pull/3751#discussion_r979748292



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -529,19 +529,10 @@ private void checkIterationMoveResults() {
       allFuturesResult.get(config.getMoveTimeout().toMillis(),
           TimeUnit.MILLISECONDS);
     } catch (InterruptedException e) {
+      LOG.warn("Container balancer is interrupted");
       Thread.currentThread().interrupt();
     } catch (TimeoutException e) {
-      long timeoutCounts = moveSelectionToFutureMap.entrySet().stream()
-          .filter(entry -> !entry.getValue().isDone())
-          .peek(entry -> {
-            LOG.warn("Container move canceled for container {} from source {}" +
-                    " to target {} due to timeout.",
-                entry.getKey().getContainerID(),
-                containerToSourceMap.get(entry.getKey().getContainerID())
-                    .getUuidString(),
-                entry.getKey().getTargetNode().getUuidString());
-            entry.getValue().cancel(true);

Review Comment:
   I remember adding this cancellation because of an issue in our cluster: 
even after the timeout, datanodes still handled the move request, so many 
containers ended up with 4 replicas and ReplicationManager had to clean up 
the redundant one.
   
   Cancelling the unfinished moves here helps eliminate these cases.
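
   To illustrate the pattern being discussed, here is a minimal, standalone sketch of "wait for all moves, then cancel whatever is still pending on timeout". It is not the ContainerBalancer code: the class `MoveTimeoutSketch`, the method `awaitOrCancel`, and the string keys are hypothetical names used only for this example, and it assumes the moves are plain `CompletableFuture`s.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the balancer's timeout handling; not Ozone code.
class MoveTimeoutSketch {

  /**
   * Waits up to timeoutMillis for every move to finish. On timeout, cancels
   * each move that is still pending and returns how many were cancelled.
   */
  static <K> long awaitOrCancel(Map<K, CompletableFuture<Boolean>> moves,
      long timeoutMillis) {
    CompletableFuture<Void> all = CompletableFuture.allOf(
        moves.values().toArray(new CompletableFuture[0]));
    try {
      all.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return 0;
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can observe it.
      Thread.currentThread().interrupt();
      return 0;
    } catch (ExecutionException e) {
      // allOf only completes once every move has completed, so nothing
      // is left to cancel even though at least one move failed.
      return 0;
    } catch (TimeoutException e) {
      long cancelled = 0;
      for (Map.Entry<K, CompletableFuture<Boolean>> entry : moves.entrySet()) {
        if (!entry.getValue().isDone()) {
          // Marks the future as cancelled; anything chained on it sees a
          // CancellationException instead of a late result.
          entry.getValue().cancel(true);
          cancelled++;
        }
      }
      return cancelled;
    }
  }

  public static void main(String[] args) {
    Map<String, CompletableFuture<Boolean>> moves = new LinkedHashMap<>();
    moves.put("finished-move", CompletableFuture.completedFuture(true));
    moves.put("stuck-move", new CompletableFuture<>()); // never completes

    long cancelled = awaitOrCancel(moves, 50);
    System.out.println("cancelled=" + cancelled);
    System.out.println("stuck-move cancelled="
        + moves.get("stuck-move").isCancelled());
  }
}
```

   Note that `CompletableFuture.cancel` only completes the future itself; it does not interrupt work already running elsewhere. Whether cancellation actually stops a datanode from acting on a late request depends on how the move future is wired up, which this sketch does not model.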



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]