Daniel Ma created HDFS-16115:
--------------------------------
Summary: Asynchronously handle BPServiceActor command mechanism
may result in BPServiceActor never fails even CommandProcessingThread is closed
with fatal error.
Key: HDFS-16115
URL: https://issues.apache.org/jira/browse/HDFS-16115
Project: Hadoop HDFS
Issue Type: Improvement
Components: datanode
Affects Versions: 3.3.1
Reporter: Daniel Ma
Fix For: 3.3.1
This is an improvement issue. It consists of two sub-issues:
1- The BPServiceActor thread handles commands from the NameNode asynchronously
(CommandProcessingThread processes the commands). If an exception or error in
CommandProcessingThread causes that thread to fail and stop, BPServiceActor is
not aware of it and keeps putting commands from the NameNode into the queue
to be handled by CommandProcessingThread.
2- The second sub-issue builds on the first one. If CommandProcessingThread
fails owing to a non-fatal error such as "can not create native thread", which
is caused by too many threads existing on the node, the failure should be
tolerated rather than simply shutting down the thread with no automatic
recovery, because a non-fatal error like the one mentioned above may soon
clear by itself.
Currently, the DataNode BPServiceActor cannot return to normal even after the
non-fatal error has been eliminated.
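
The behaviour requested in the two sub-issues can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the actual
BPServiceActor or CommandProcessingThread code: the class
SupervisedCommandQueue, its methods, and the restart limit are all
hypothetical names chosen for this example. The producer checks whether the
consumer thread is still alive before queueing a command, restarts it a
bounded number of times, and refuses further commands once that budget is
used up, so that the caller can fail instead of queueing forever.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch; not taken from the HDFS code base. */
public class SupervisedCommandQueue {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final int maxRestarts;          // assumed tolerance for failures
    private int restarts = 0;
    private volatile Thread worker;
    private volatile Throwable lastFailure;

    public SupervisedCommandQueue(int maxRestarts) {
        this.maxRestarts = maxRestarts;
        startWorker();
    }

    /** Starts a consumer thread that runs queued commands until it fails. */
    private void startWorker() {
        Thread t = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    queue.take().run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (Throwable failure) {
                // Record the cause; the thread then exits, as in sub-issue 1.
                lastFailure = failure;
            }
        }, "command-processor");
        t.setDaemon(true);
        worker = t;
        t.start();
    }

    /**
     * Producer side. Returns false when the consumer is dead and the restart
     * budget is exhausted, so the caller can stop instead of queueing blindly.
     */
    public synchronized boolean enqueue(Runnable command) {
        if (!worker.isAlive()) {
            if (restarts >= maxRestarts) {
                return false;
            }
            restarts++;
            startWorker();                  // recovery, as in sub-issue 2
        }
        queue.add(command);
        return true;
    }

    public synchronized int restartCount() { return restarts; }

    public boolean isWorkerAlive() { return worker.isAlive(); }

    public Throwable lastFailure() { return lastFailure; }

    /** Waits up to the given time for the current consumer thread to exit. */
    public void awaitWorkerExit(long millis) throws InterruptedException {
        worker.join(millis);
    }
}
```

In this sketch a command that throws stops the consumer thread; the next call
to enqueue notices the dead thread and restarts it, and once the assumed
restart limit is reached enqueue returns false. A real fix would also need to
decide which errors count as non-fatal and whether to delay between restarts.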