[
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508695#comment-16508695
]
Vinod Kumar Vavilapalli commented on HADOOP-15527:
--------------------------------------------------
Here is the output of some debug logging I added to
hadoop/libexec/hadoop-functions.sh, after also wrapping the "ps" check in a
while loop.
{code}
=========== 2018-06-10 00:43:31,754 vinodkv inside scripts sending SIGTERM
=========== 2018-06-10 00:43:31,756 vinodkv inside scripts SIGTERM sent, sleeping
=========== 2018-06-10 00:43:36,759 vinodkv inside scripts 3989960 still alive! sending sig-kill
=========== 2018-06-10 00:43:36,797 vinodkv inside scripts sigkill sent
=========== 2018-06-10 00:43:36,827 vinodkv inside scripts.. unable to kill 3989960
=========== 2018-06-10 00:43:36,846 vinodkv inside scripts.. unable to kill 3989960
=========== 2018-06-10 00:43:36,866 vinodkv inside scripts.. unable to kill 3989960
=========== 2018-06-10 00:43:36,885 vinodkv inside scripts.. unable to kill 3989960
=========== 2018-06-10 00:43:36,904 vinodkv inside scripts.. unable to kill 3989960
=========== 2018-06-10 00:43:36,924 vinodkv inside scripts.. process 3989960 finally dead
{code}
{code}
=========== 2018-06-10 00:48:00,884 vinodkv inside scripts sending SIGTERM
=========== 2018-06-10 00:48:00,886 vinodkv inside scripts SIGTERM sent, sleeping
=========== 2018-06-10 00:48:05,890 vinodkv inside scripts 3992747 still alive! sending sig-kill
=========== 2018-06-10 00:48:05,898 vinodkv inside scripts sigkill sent
=========== 2018-06-10 00:48:05,921 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:05,938 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:05,953 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:05,970 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:05,987 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:06,006 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:06,024 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:06,042 vinodkv inside scripts.. process 3992747 finally dead
{code}
Once "kill -9" is sent, the RM takes roughly 125-145 milliseconds to come
down (about 127 ms and 144 ms in the two runs above, measured from the
"sigkill sent" line to the "finally dead" line).
This may be due to system load; I don't have any other explanation for why
this is only happening now.
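For illustration, the kind of polling loop described above could look roughly
like the sketch below. This is not the actual change to hadoop-functions.sh:
the function name wait_for_exit, its parameters and their defaults are all
hypothetical, and it uses "kill -0" as a stand-in for the "ps" check (which
only works for processes the caller is allowed to signal).

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of hadoop-functions.sh.
# Polls until the given PID is gone or the attempt budget is used up.
# Returns 0 if the process has exited, 1 if it is still alive.
wait_for_exit() {
  local pid=$1
  local max_attempts=${2:-50}   # assumed default: at most 50 polls
  local interval=${3:-0.1}      # assumed default: 0.1 s between polls
                                # (fractional sleep is a GNU/BSD extension)
  local attempt=0

  # "kill -0" delivers no signal; it only reports whether the PID still
  # exists and can be signalled by the caller.
  while kill -0 "${pid}" 2>/dev/null; do
    attempt=$((attempt + 1))
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      return 1
    fi
    sleep "${interval}"
  done
  return 0
}
```

A stop script could call this right after sending "kill -9" and report
"Unable to kill" only when it returns non-zero. With the delays seen in the
logs above, a budget of a few hundred milliseconds would have been enough,
though that is an observation from two runs, not a guarantee.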
> Sometimes daemons keep running even after "kill -9" from daemon-stop script
> ---------------------------------------------------------------------------
>
> Key: HADOOP-15527
> URL: https://issues.apache.org/jira/browse/HADOOP-15527
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Vinod Kumar Vavilapalli
> Priority: Major
>
> I'm seeing that sometimes daemons keep running for a little while even after
> "kill -9" from daemon-stop scripts.
> Debugging more, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager - {{yarn --daemon
> stop nodemanager}} - though it is possible that other daemons may run into
> this too.
> Saw this on both CentOS and Ubuntu.