siddhantsangwan commented on code in PR #3782:
URL: https://github.com/apache/ozone/pull/3782#discussion_r989971036


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -892,34 +79,38 @@ private void resetState() {
    */
   @Override
   public void notifyStatusChanged() {
-    boolean shouldStop = false;
-    boolean shouldRun = false;
+    if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
+      boolean shouldStop;
+      lock.lock();
+      try {
+        shouldStop = canBalancerStop();
+      } finally {
+        lock.unlock();
+      }
+      if (shouldStop) {
+        LOG.info("Stopping ContainerBalancer in this scm on status change");
+        stop();
+      }
+      return;
+    }
+
     lock.lock();
     try {
-      if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
-        shouldStop = isBalancerRunning();
-      } else {
-        shouldRun = shouldRun();
+      // else check for start
+      boolean shouldRun = shouldRun();
+      if (shouldRun && canBalancerStart()) {

Review Comment:
   I'm wondering if there's a case for which we need the `canBalancerStart()` 
check here. Is it possible that balancer would already be running in the new 
leader SCM or when an SCM gets out of safe mode?



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -1072,15 +295,15 @@ private void validateState(boolean expectedRunning)
       LOG.warn("SCM is in safe mode");
       throw new IllegalContainerBalancerStateException("SCM is in safe mode");
     }
-    lock.lock();
-    try {
-      if (isBalancerRunning() != expectedRunning) {
-        throw new IllegalContainerBalancerStateException(
-            "Expect ContainerBalancer running state to be " + expectedRunning +
-                ", but running state is actually " + isBalancerRunning());
-      }
-    } finally {
-      lock.unlock();
+    if (!expectedRunning && !canBalancerStart()) {
+      throw new IllegalContainerBalancerStateException(
+          "Expect ContainerBalancer as not running state" +
+              ", but running state is actually " + getBalancerStatus());
+    }
+    if (expectedRunning && !canBalancerStop()) {
+      throw new IllegalContainerBalancerStateException(
+          "Expect ContainerBalancer as running state" +
+              ", but running state is actually " + getBalancerStatus());

Review Comment:
   Do we need this change? It seems unnecessary.



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -1137,45 +351,21 @@ public void stopBalancer()
       TimeoutException {
     lock.lock();
     try {
-      // should be leader, out of safe mode, and currently running
       validateState(true);
-      saveConfiguration(config, false, 0);

Review Comment:
   I think we should persist to DB that balancer should not run before stopping 
it.



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -950,10 +141,41 @@ public boolean shouldRun() {
   /**
    * Checks if ContainerBalancer is currently running in this SCM.
    *
-   * @return true if the currentBalancingThread is not null, otherwise false
+   * @return true if balancer started, otherwise false
    */
   public boolean isBalancerRunning() {
-    return currentBalancingThread != null;
+    return (null != task
+        && task.getBalancerStatus() == ContainerBalancerTask.Status.RUNNING);
+  }
+
+  /**
+   * Checks if ContainerBalancer in valid state and can be started.
+   *
+   * @return true if balancer can be started, otherwise false
+   */
+  private boolean canBalancerStart() {
+    return (null == task
+        || task.getBalancerStatus() == ContainerBalancerTask.Status.STOPPED);
+  }
+
+  /**
+   * get the Container Balancer state.
+   *
+   * @return true if balancer started, otherwise false
+   */
+  public ContainerBalancerTask.Status getBalancerStatus() {
+    return null != task ? task.getBalancerStatus()
+        : ContainerBalancerTask.Status.STOPPED;
+  }
+
+  /**
+   * Checks if ContainerBalancer is in valid state to call stop.
+   *
+   * @return true if balancer can be stopped, otherwise false
+   */
+  private boolean canBalancerStop() {
+    return null != task
+        && task.getBalancerStatus() == ContainerBalancerTask.Status.RUNNING;
   }

Review Comment:
   This method seems redundant. `isBalancerRunning` can be used instead. If we 
want to have `canBalancerStop` for better readability, let's just call 
`isBalancerRunning` inside it?



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -892,34 +79,38 @@ private void resetState() {
    */
   @Override
   public void notifyStatusChanged() {
-    boolean shouldStop = false;
-    boolean shouldRun = false;
+    if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
+      boolean shouldStop;
+      lock.lock();
+      try {
+        shouldStop = canBalancerStop();
+      } finally {
+        lock.unlock();
+      }
+      if (shouldStop) {
+        LOG.info("Stopping ContainerBalancer in this scm on status change");
+        stop();
+      }
+      return;
+    }
+
     lock.lock();

Review Comment:
   Any reason for acquiring the lock twice instead of once like before?



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -1092,33 +315,24 @@ public void stop() {
     Thread balancingThread;
     lock.lock();
     try {
-      if (!isBalancerRunning()) {
-        LOG.warn("Cannot stop Container Balancer because it's not running");
+      if (!canBalancerStop()) {
+        LOG.warn("Cannot stop Container Balancer because it's not running or " 
+
+            "stopping");
         return;
       }
+      task.stop();
       balancingThread = currentBalancingThread;

Review Comment:
   NIT: Since we're not setting `currentBalancingThread` to null now, we don't 
need the temporary variable `balancingThread`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to