ChenSammi commented on code in PR #6587:
URL: https://github.com/apache/ozone/pull/6587#discussion_r1582542241
##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java:
##########
@@ -1130,6 +1134,21 @@ public void close() {
evictStateMachineCache();
executor.shutdown();
metrics.unRegister();
+
+ // if datanodeService is stopped , it indicates this `close` originates
+ // from `HddsDatanodeService.stop()`, otherwise, it indicates this `close` originates
+ // from ratis.
+ if (datanodeService != null && !datanodeService.isStopped()) {
+ LOG.error("Container statemachine is closed by ratis, terminating HddsDatanodeService");
+ // wait a while for other pipeline's ContainerStateMachine.close() called.
+ try {
+ Thread.sleep(10000);
Review Comment:
DN can recover from an immediate crash or kill (for example, because of
OOM), so the 10s here is only an attempt to let the other pipelines persist as
much state as possible to disk. Shutting down immediately or waiting 5s or 10s
makes no big difference. I just think waiting a while is better, as we do when
an executor pool is shut down.
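
To illustrate the executor-pool comparison, here is a minimal sketch of a
bounded grace period on shutdown, using only the standard
`java.util.concurrent` API. It is not the code in this PR: the class
`GracefulStop`, the method `shutdownAndWait`, and the 10-second value in `main`
are hypothetical names and numbers chosen for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Ozone: shows a bounded wait on shutdown.
class GracefulStop {

  /**
   * Stops accepting new tasks, waits up to graceMillis for running tasks to
   * finish, then interrupts whatever is still running.
   * Returns true only if all tasks finished within the grace period.
   */
  static boolean shutdownAndWait(ExecutorService pool, long graceMillis) {
    pool.shutdown(); // no new tasks; already submitted tasks keep running
    try {
      if (pool.awaitTermination(graceMillis, TimeUnit.MILLISECONDS)) {
        return true;
      }
      pool.shutdownNow(); // grace period expired, interrupt the rest
      return false;
    } catch (InterruptedException e) {
      pool.shutdownNow();
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    pool.submit(() -> System.out.println("persisting state"));
    // Assumed grace period of 10s, mirroring the sleep discussed above.
    boolean clean = shutdownAndWait(pool, 10_000);
    System.out.println("clean shutdown: " + clean);
  }
}
```

Unlike a fixed `Thread.sleep(10000)`, this returns as soon as the work is
done, so the wait is an upper bound and not a constant delay.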
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]