[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508695#comment-16508695
 ] 

Vinod Kumar Vavilapalli commented on HADOOP-15527:
--------------------------------------------------

Here is the output from some debug logging I added to 
hadoop/libexec/hadoop-functions.sh, after also wrapping the "ps" check in a 
while loop.
{code}
=========== 2018-06-10 00:43:31,754 vinodkv inside scripts sending SIGTERM
=========== 2018-06-10 00:43:31,756 vinodkv inside scripts SIGTERM sent, sleeping
=========== 2018-06-10 00:43:36,759 vinodkv inside scripts 3989960 still alive! sending sig-kill
=========== 2018-06-10 00:43:36,797 vinodkv inside scripts sigkill sent
=========== 2018-06-10 00:43:36,827 vinodkv inside scripts.. unable to kill 3989960
=========== 2018-06-10 00:43:36,846 vinodkv inside scripts.. unable to kill 3989960
=========== 2018-06-10 00:43:36,866 vinodkv inside scripts.. unable to kill 3989960
=========== 2018-06-10 00:43:36,885 vinodkv inside scripts.. unable to kill 3989960
=========== 2018-06-10 00:43:36,904 vinodkv inside scripts.. unable to kill 3989960
=========== 2018-06-10 00:43:36,924 vinodkv inside scripts.. process 3989960 finally dead
{code}
{code}
=========== 2018-06-10 00:48:00,884 vinodkv inside scripts sending SIGTERM
=========== 2018-06-10 00:48:00,886 vinodkv inside scripts SIGTERM sent, sleeping
=========== 2018-06-10 00:48:05,890 vinodkv inside scripts 3992747 still alive! sending sig-kill
=========== 2018-06-10 00:48:05,898 vinodkv inside scripts sigkill sent
=========== 2018-06-10 00:48:05,921 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:05,938 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:05,953 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:05,970 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:05,987 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:06,006 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:06,024 vinodkv inside scripts.. unable to kill 3992747
=========== 2018-06-10 00:48:06,042 vinodkv inside scripts.. process 3992747 finally dead
{code}

It takes roughly 125-145 milliseconds for the RM to come down once a "kill -9" 
is sent (127 ms and 144 ms in the two runs above).

This may be due to system load.

I don't have any other explanation as to why this is only happening now.
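
For illustration, the kind of wait loop described above can be sketched roughly 
as follows. This is a minimal sketch under stated assumptions, not the actual 
hadoop-functions.sh code: the function name stop_and_wait, its parameters, and 
the 0.1 second poll interval are hypothetical, and "kill -0" stands in for the 
"ps" check.

```shell
# Hypothetical helper, NOT the real hadoop-functions.sh implementation.
# Sends SIGTERM, sleeps, escalates to SIGKILL if needed, then polls until
# the process is gone instead of checking only once.
stop_and_wait() {
  pid=$1
  term_wait=${2:-5}    # seconds to sleep after SIGTERM
  max_polls=${3:-50}   # assumed cap on liveness checks after SIGKILL
  polls=0

  kill "${pid}" 2>/dev/null
  sleep "${term_wait}"

  # "kill -0" only tests whether the pid still exists; it sends no signal.
  if kill -0 "${pid}" 2>/dev/null; then
    kill -9 "${pid}" 2>/dev/null
    # The process may remain visible for a short while after SIGKILL is
    # sent, so keep checking rather than failing on the first check.
    while kill -0 "${pid}" 2>/dev/null; do
      polls=$((polls + 1))
      if [ "${polls}" -ge "${max_polls}" ]; then
        echo "ERROR: Unable to kill ${pid}" >&2
        return 1
      fi
      sleep 0.1
    done
  fi
  return 0
}
```

With these assumed defaults, a call such as "stop_and_wait 3989960" would 
report an error only if the process is still present about 5 seconds after the 
SIGKILL, which comfortably covers the 125-145 millisecond delay seen in the 
logs.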

> Sometimes daemons keep running even after "kill -9" from daemon-stop script
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-15527
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15527
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Vinod Kumar Vavilapalli
>            Priority: Major
>
> I'm seeing that sometimes daemons keep running for a little while even after 
> "kill -9" from daemon-stop scripts.
> Debugging more, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager -  {{yarn --daemon 
> stop nodemanager}}. Though it is possible that other daemons may run into 
> this too.
> Saw this on both CentOS and Ubuntu.



