siddhantsangwan commented on PR #3535:
URL: https://github.com/apache/ozone/pull/3535#issuecomment-1167244527

   @lokeshj1703 thanks for reviewing! I've added locking in 
`notifyStatusChanged` after considering the following scenario:
   
   The current SCM is the leader, and `startBalancer` is called. The lock will 
be acquired, `startBalancer` will validate state, persist configuration and 
call `startBalancingThread`. 
   
   ```
     public void startBalancer(ContainerBalancerConfiguration configuration)
         throws IllegalContainerBalancerStateException,
         InvalidContainerBalancerConfigurationException, IOException {
       lock.lock();
       try {
         // validates state, config, and then saves config
         setBalancerConfigOnStartBalancer(configuration);
         startBalancingThread();
       } finally {
         lock.unlock();
       }
     }
   
   ```
   
   Suppose at this instant there's a leader change and `notifyStatusChange` is 
called from another thread. Container Balancer cannot run in this SCM since 
it's a follower now.
   
   ```
    public void notifyStatusChanged() {
       if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
         if (isBalancerRunning()) {
           LOG.info("Stopping ContainerBalancer in this scm on status change");
           stop();
         }
       } else {
         if (shouldRun()) {
           try {
             LOG.info("Starting ContainerBalancer in this scm on status 
change");
             start();
           } catch (IllegalContainerBalancerStateException |
                   InvalidContainerBalancerConfigurationException e) {
             LOG.warn("Could not start ContainerBalancer on raft/safe-mode " +
                     "status change.", e);
           }
         }
       }
     }
   ```
   
   The first `if` condition `if (!scmContext.isLeader() || 
scmContext.isInSafeMode())` will evaluate to `true`, but `if 
(isBalancerRunning())` will be false because the former thread has still not 
started the balancing thread. So, `notifyStatusChanged` will not take any 
action. Now, `startBalancingThread` will end up starting balancer in this SCM, 
which is incorrect. The latest change will prevent this scenario from happening.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to