[
https://issues.apache.org/jira/browse/MAPREDUCE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196317#comment-13196317
]
Ramya Sunil commented on MAPREDUCE-3746:
----------------------------------------
Hi Devaraj,
I can reproduce the issue consistently. I have 6 NMs, and I decommission
all of them while a few apps are still running. Below are the last few lines
of the log from an NM that failed to shut down completely:
{noformat}
2012-01-30 18:53:46,786 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
2012-01-30 18:53:46,786 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker is stopped.
2012-01-30 18:53:46,787 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService is stopped.
2012-01-30 18:53:46,786 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Failed to launch container.
java.io.IOException: java.lang.InterruptedException
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:264)
    at org.apache.hadoop.util.Shell.run(Shell.java:188)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:381)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:207)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:241)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2012-01-30 18:53:46,787 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl is stopped.
2012-01-30 18:53:46,788 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl is stopped.
2012-01-30 18:53:46,788 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService is stopped.
2012-01-30 18:53:46,788 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.NodeHealthCheckerService is stopped.
2012-01-30 18:53:56,791 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.DeletionService is stopped.
2012-01-30 18:53:56,792 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.NodeManager is stopped.
2012-01-30 18:53:56,792 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system...
2012-01-30 18:53:56,792 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped.
2012-01-30 18:53:56,792 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2012-01-30 18:53:56,793 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl is stopped.
{noformat}
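The timestamps in the excerpt above show a pause of roughly ten seconds between the NodeHealthCheckerService stop (18:53:46,788) and the DeletionService stop (18:53:56,791). For anyone comparing logs across the six NMs, a small helper along these lines might make such pauses easier to spot. This is only a sketch: the class name StopGaps and the method gapsMillis are hypothetical and not part of Hadoop, and it assumes every log record begins with a timestamp of the form shown above. It does not by itself explain why the process stays alive.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Hadoop): measures the time between
// consecutive timestamped records in a NodeManager log excerpt.
public class StopGaps {
    // Timestamp at the start of a record, e.g. "2012-01-30 18:53:46,788 ".
    private static final Pattern TS =
        Pattern.compile("^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\s");
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

    /**
     * Returns the gap in milliseconds between each pair of consecutive
     * timestamped lines. Lines without a leading timestamp (wrapped text,
     * stack trace frames) are skipped. A gap can be negative when records
     * were written slightly out of order.
     */
    public static List<Long> gapsMillis(List<String> lines) {
        List<Long> gaps = new ArrayList<>();
        LocalDateTime prev = null;
        for (String line : lines) {
            Matcher m = TS.matcher(line);
            if (!m.find()) {
                continue; // continuation line, no timestamp
            }
            LocalDateTime cur = LocalDateTime.parse(m.group(1), FMT);
            if (prev != null) {
                gaps.add(Duration.between(prev, cur).toMillis());
            }
            prev = cur;
        }
        return gaps;
    }

    public static void main(String[] args) {
        List<String> excerpt = Arrays.asList(
            "2012-01-30 18:53:46,788 INFO org.apache.hadoop.yarn.service.AbstractService:",
            "Service:org.apache.hadoop.yarn.server.nodemanager.NodeHealthCheckerService is",
            "stopped.",
            "2012-01-30 18:53:56,791 INFO org.apache.hadoop.yarn.service.AbstractService:",
            "Service:org.apache.hadoop.yarn.server.nodemanager.DeletionService is stopped.");
        System.out.println(gapsMillis(excerpt)); // prints [10003]
    }
}
```

Skipping lines that lack a timestamp means the helper works on both wrapped and unwrapped copies of the log.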
> Nodemanagers are not automatically shut down after decommissioning
> ------------------------------------------------------------------
>
> Key: MAPREDUCE-3746
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3746
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.1
> Reporter: Ramya Sunil
> Assignee: Devaraj K
> Priority: Critical
> Fix For: 0.23.1
>
>
> Nodemanagers are not automatically shut down after decommissioning.
> MAPREDUCE-2775 does not seem to fix the issue.
--
This message is automatically generated by JIRA.