Zilong Zhu created HDFS-17504:
---------------------------------
Summary: DN process should exit when BPServiceActor exit
Key: HDFS-17504
URL: https://issues.apache.org/jira/browse/HDFS-17504
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Zilong Zhu
BPServiceActor is a critical thread. In a non-HA cluster, the exit of the
BPServiceActor thread causes the DN process to exit. In an HA cluster,
however, it does not.
I found that HDFS-15651 causes the BPServiceActor thread to exit and changes
"runningState" from "RunningState.FAILED" to "RunningState.EXITED", which can
be confusing during troubleshooting.
I believe the DN process should exit when the "runningState" of the
BPServiceActor is set to RunningState.FAILED, because at that point the DN
cannot recover and re-establish a heartbeat connection with the ANN on its
own.
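
To make the proposal concrete, here is a minimal sketch of the exit decision.
It is not Hadoop code: the class ActorExitPolicy, the method shouldExit, and
the reduced RunningState enum are hypothetical names used only for
illustration, and "exit if any actor is FAILED" is one possible reading of
the proposal for an HA cluster with several actors.

```java
import java.util.List;

public class ActorExitPolicy {

    /** Hypothetical stand-in for the states named in this report; not Hadoop's enum. */
    public enum RunningState { RUNNING, EXITED, FAILED }

    /**
     * Returns true if the DN process should terminate under the proposed
     * policy: at least one actor has reached FAILED. (Assumption: a single
     * failed actor is enough, even if other actors are still running.)
     */
    public static boolean shouldExit(List<RunningState> actorStates) {
        for (RunningState state : actorStates) {
            if (state == RunningState.FAILED) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // HA-style case: one actor still running, the other failed.
        List<RunningState> states =
            List.of(RunningState.RUNNING, RunningState.FAILED);
        System.out.println("should exit: " + shouldExit(states));
    }
}
```

Note that under this sketch an actor whose state has already been changed
from FAILED to EXITED would no longer trigger the exit, which is the
troubleshooting confusion described above.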