hanishakoneru commented on a change in pull request #2953:
URL: https://github.com/apache/ozone/pull/2953#discussion_r840763825



##########
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##########
@@ -470,8 +469,12 @@ public void close() throws IOException {
     // OM should be shutdown as the StateMachine has shutdown.
     LOG.info("StateMachine has shutdown. Shutdown OzoneManager if not " +
         "already shutdown.");
-    ozoneManager.shutdown(new RaftException("RaftServer called shutdown on " +
-        "StateMachine"));
+    super.close();
+    if (!ozoneManager.isStopped()) {
+      ozoneManager.stop();
+    }
+    ExitUtils.terminate(1, "OM state machine is shutdown by Ratis server",
+        null, false, LOG);

Review comment:
       Since OzoneManager is the base process, termination should originate 
from there.
   If OzoneManager#stop is called, it would stop all the services but since it 
does not call OMStateMachine#close(), it would not terminate the process. 
   OzoneManager should have two methods - stop() and shutdown(). Stop() would 
just stop the services which is needed for tests. Shutdown can call stop() and 
then terminate the process. 
   Critical failures, like the one here is when StateMachine is closed, would 
then call OzoneManager#shutdown to terminate the process.

##########
File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/BackgroundPipelineCreator.java
##########
@@ -133,9 +133,6 @@ public void start() {
         .setDaemon(false)
         .setNameFormat(THREAD_NAME + " - %d")
         .setUncaughtExceptionHandler((Thread t, Throwable ex) -> {
-          // gracefully shutdown SCM.
-          scmContext.getScm().stop();
-

Review comment:
       I think this stop should be called to stop all the SCM services. 
   
   It is the terminate which needs to be moved to StorageContainerManager after 
all the other processes are stopped.

##########
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##########
@@ -470,8 +469,12 @@ public void close() throws IOException {
     // OM should be shutdown as the StateMachine has shutdown.
     LOG.info("StateMachine has shutdown. Shutdown OzoneManager if not " +
         "already shutdown.");
-    ozoneManager.shutdown(new RaftException("RaftServer called shutdown on " +
-        "StateMachine"));
+    super.close();
+    if (!ozoneManager.isStopped()) {
+      ozoneManager.stop();
+    }
+    ExitUtils.terminate(1, "OM state machine is shutdown by Ratis server",
+        null, false, LOG);

Review comment:
       SCM could also have a similar stop() and shutdown() methods.
   
   Let me know your thoughts. Thanks.

##########
File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMHAManagerImpl.java
##########
@@ -337,7 +337,7 @@ void reloadSCMState()
   public void shutdown() throws IOException {
     if (ratisServer != null) {
       ratisServer.stop();
-      ratisServer.getSCMStateMachine().close();
+      ratisServer.getSCMStateMachine().stop();

Review comment:
       We can attempt to stop the services before shutting down. 
   But either way, I think the terminate should happen in 
StorageContainerManager as that is the base service which instantiates and has 
a handle for all the services. So each sub-service should call back 
StorageContainerManager to terminate the process.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to