[ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208601#comment-13208601 ]
Jason Lowe commented on MAPREDUCE-3862:
---------------------------------------

DeletionService has the following code, which implies we don't want to wait too long for the shutdown to complete:

{code}
  public void stop() {
    sched.shutdown();
    try {
      sched.awaitTermination(10, SECONDS);
    } catch (InterruptedException e) {
      sched.shutdownNow();
    }
    super.stop();
  }
{code}

However, the code never checks the result from {{awaitTermination()}}, so if the 10-second wait times out we continue the shutdown process with the thread pool still active.

> Nodemanager can appear to hang on shutdown due to lingering DeletionService threads
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3862
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3862
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.23.1
>            Reporter: Jason Lowe
>
> When a nodemanager attempts to shut down cleanly, it's possible for it to
> appear to hang due to lingering DeletionService threads. This can occur when
> yarn.nodemanager.delete.debug-delay-sec is set to a relatively large value
> and one or more containers execute on the node shortly before the shutdown.
> The DeletionService never calls
> {{setExecuteExistingDelayedTasksAfterShutdownPolicy()}} on the
> ScheduledThreadPoolExecutor, which defaults to waiting for all scheduled
> tasks to complete before exiting.
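The two problems described above (the ignored {{awaitTermination()}} result and the default delayed-task policy) could be addressed along the following lines. This is only a minimal standalone sketch, not the actual DeletionService code or the patch for this issue: the class {{DeletionShutdownSketch}}, the method {{scheduleDelete()}}, the pool size, and the timeout parameters are all hypothetical names chosen for illustration. Only the {{java.util.concurrent}} calls are real.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a service that owns a scheduled thread pool.
class DeletionShutdownSketch {
    // Pool size of 2 is an arbitrary value for this sketch.
    private final ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(2);

    DeletionShutdownSketch() {
        // By default, delayed tasks still in the queue are kept after shutdown()
        // and run when their delay expires. Setting the policy to false makes
        // shutdown() cancel them instead.
        sched.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);
    }

    // Hypothetical helper: schedule a task to run after delaySec seconds.
    void scheduleDelete(Runnable task, long delaySec) {
        sched.schedule(task, delaySec, TimeUnit.SECONDS);
    }

    // Returns true if the pool had terminated by the time this method returns.
    boolean stop(long timeout, TimeUnit unit) {
        sched.shutdown();
        boolean terminated = false;
        try {
            // awaitTermination() returns false if the timeout elapsed first.
            terminated = sched.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
        }
        if (!terminated) {
            // Timed out or interrupted: drop queued tasks and interrupt running ones.
            sched.shutdownNow();
        }
        return sched.isTerminated();
    }

    public static void main(String[] args) {
        DeletionShutdownSketch svc = new DeletionShutdownSketch();
        // Simulates a deletion delayed by a large debug-delay value.
        svc.scheduleDelete(() -> System.out.println("deleting"), 3600);
        System.out.println("terminated=" + svc.stop(10, TimeUnit.SECONDS));
    }
}
```

With the policy set to false, the pending one-hour task is cancelled by {{shutdown()}}, so the wait should return well before the timeout. Whether discarding pending deletions at shutdown is acceptable for the nodemanager is a design decision this sketch does not settle.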