Wangda Tan created HADOOP-12441:
-----------------------------------

             Summary: Fix kill command execution under Ubuntu 12
                 Key: HADOOP-12441
                 URL: https://issues.apache.org/jira/browse/HADOOP-12441
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Wangda Tan
            Priority: Blocker


After HADOOP-12317, the kill command fails under Ubuntu 12. After the NM
restarts, it cannot determine from a container's pid whether the process is
still alive, and it cannot kill the process correctly when the RM/AM tells
the NM to kill a container.
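
For illustration only, here is a minimal sketch of the two argument vectors
that are likely involved. It assumes the liveness check is a signal-0 probe
(signal 0 sends nothing and only tests whether the pid exists) and that the
failing form carries a "--" separator, as the 'garbage process ID "--"' error
in the logs suggests. The class name, method name and pid are hypothetical,
and this is not the code in Hadoop's Shell class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: builds a "kill -0" probe command.
final class CheckAliveCommandSketch {

    // withDoubleDash = true adds the "--" option terminator before the pid.
    // Some kill implementations reject "--" and treat it as a process ID.
    static String[] checkAliveCommand(String pid, boolean withDoubleDash) {
        List<String> cmd = new ArrayList<>();
        cmd.add("kill");
        cmd.add("-0");          // signal 0: existence check only
        if (withDoubleDash) {
            cmd.add("--");
        }
        cmd.add(pid);
        return cmd.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String pid = "12345";   // hypothetical pid
        System.out.println(String.join(" ", checkAliveCommand(pid, true)));
        System.out.println(String.join(" ", checkAliveCommand(pid, false)));
    }
}
```

Which form a given distribution accepts depends on the kill binary it ships,
so either variant should be verified on the target host.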

Logs from NM (customized logs):
{code}
2015-09-25 21:58:59,348 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:containerIsAlive(431)) -  ================== check alive cmd:[[Ljava.lang.String;@496e442d]
2015-09-25 21:58:59,349 INFO  nodemanager.NMAuditLogger (NMAuditLogger.java:logSuccess(89)) - USER=hrt_qa       IP=10.0.1.14    OPERATION=Stop Container Request        TARGET=ContainerManageImpl      RESULT=SUCCESS  APPID=application_1443218269460_0001    CONTAINERID=container_1443218269460_0001_01_000001
2015-09-25 21:58:59,363 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:containerIsAlive(438)) -  ===========================
ExitCodeException exitCode=1: ERROR: garbage process ID "--".
Usage:
  kill pid ...              Send SIGTERM to every process listed.
  kill signal pid ...       Send a signal to every process listed.
  kill -s signal pid ...    Send a signal to every process listed.
  kill -l                   List all signal names.
  kill -L                   List all signal names in a nice table.
  kill -l signal            Convert between signal numbers and names.

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:550)
        at org.apache.hadoop.util.Shell.run(Shell.java:461)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:727)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.containerIsAlive(DefaultContainerExecutor.java:432)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.signalContainer(DefaultContainerExecutor.java:401)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:419)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
        at java.lang.Thread.run(Thread.java:745)
{code}
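
Note that the customized "check alive cmd" log line above prints the array's
default toString (type name plus identity hash) rather than the command
arguments. A minimal sketch of a more readable rendering follows; the class
name and the sample command are hypothetical and are not taken from the NM
code.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Hadoop: renders a command array for logs.
final class CommandLogSketch {

    // Arrays.toString prints the elements instead of "[Ljava.lang.String;@<hash>".
    static String render(String[] cmd) {
        return Arrays.toString(cmd);
    }

    public static void main(String[] args) {
        String[] cmd = {"kill", "-0", "--", "12345"};   // hypothetical command
        // Concatenating the array directly yields the type name and a hash.
        System.out.println("check alive cmd: " + cmd);
        // Rendering it first yields the actual arguments.
        System.out.println("check alive cmd: " + render(cmd));
    }
}
```

With the arguments visible in the log, it is easier to confirm which form of
the kill command was actually executed.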



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
