[
https://issues.apache.org/jira/browse/MAPREDUCE-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270583#comment-13270583
]
Devaraj K commented on MAPREDUCE-4031:
--------------------------------------
During NM shutdown, the shutdown sequence acquires the class lock on
java.lang.Shutdown and starts executing the shutdown hook for the NM. The hook
invokes nm.stop(), which stops all the other services. When it stops
AsyncDispatcher, AsyncDispatcher waits for eventHandlingThread to join.
If an exception occurs on eventHandlingThread after the shutdown hook has
started executing but before it reaches the join on eventHandlingThread, the
catch block calls System.exit(). That System.exit() call tries to acquire the
class lock on java.lang.Shutdown, which is held by the shutdown hook thread,
and so it waits forever. The shutdown hook thread in turn waits forever to
join eventHandlingThread.
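
The wait cycle described above can be modelled with a minimal sketch. This is
not NodeManager code: the class name ShutdownDeadlockModel, the simulate()
method, and the ReentrantLock standing in for the class lock on
java.lang.Shutdown are all assumptions made for illustration. Bounded waits
replace the real unbounded ones so that the program terminates and can report
that each thread was blocked on the other.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

public class ShutdownDeadlockModel {
    // Hypothetical stand-in for the class lock on java.lang.Shutdown.
    private static final ReentrantLock shutdownLock = new ReentrantLock();

    /**
     * Models the cycle with bounded waits (timeoutMillis must be > 0).
     * Returns true if the "exit" call could not get the lock AND the
     * "hook" could not join the event thread within the timeout.
     */
    public static boolean simulate(long timeoutMillis) throws InterruptedException {
        CountDownLatch hookHoldsLock = new CountDownLatch(1);
        AtomicBoolean exitBlocked = new AtomicBoolean(false);
        AtomicBoolean joinTimedOut = new AtomicBoolean(false);

        // Plays the role of eventHandlingThread hitting an exception.
        Thread eventHandlingThread = new Thread(() -> {
            try {
                hookHoldsLock.await(); // the failure happens after the hook has started
                // Models System.exit() in the catch block: it needs the shutdown lock.
                if (shutdownLock.tryLock(2 * timeoutMillis, TimeUnit.MILLISECONDS)) {
                    shutdownLock.unlock();
                } else {
                    exitBlocked.set(true); // the real call would wait here forever
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "eventHandlingThread");

        // Plays the role of the shutdown hook thread running nm.stop().
        Thread shutdownHook = new Thread(() -> {
            shutdownLock.lock(); // held for the whole shutdown sequence
            try {
                hookHoldsLock.countDown();
                // Models the dispatcher stop joining eventHandlingThread.
                eventHandlingThread.join(timeoutMillis);
                joinTimedOut.set(eventHandlingThread.isAlive());
                // Keep holding the lock until the event thread gives up,
                // as the real hook would never release it in this state.
                eventHandlingThread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                shutdownLock.unlock();
            }
        }, "shutdownHook");

        eventHandlingThread.start();
        shutdownHook.start();
        shutdownHook.join();
        eventHandlingThread.join();
        return exitBlocked.get() && joinTimedOut.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("wait cycle reproduced: " + simulate(200));
    }
}
```

In the real scenario neither wait has a timeout, so both threads stay blocked.
The model only shows the ordering of lock acquisition and join that produces
the cycle; it does not suggest how the issue should be fixed.
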
> Node Manager hangs on shut down
> -------------------------------
>
> Key: MAPREDUCE-4031
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4031
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2, nodemanager
> Affects Versions: 0.23.2, 2.0.0, 3.0.0
> Reporter: Devaraj K
> Assignee: Devaraj K
> Priority: Critical
> Attachments: nm-threaddump.out
>
>
> I have the MAPREDUCE-3862 changes, which fixed this issue earlier, and
> "yarn.nodemanager.delete.debug-delay-sec" is set to its default value, but
> I am still getting this issue.