[jira] [Commented] (YARN-76) killApplication doesn't fully kill application master on Mac OS

2015-06-08 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578065#comment-14578065
 ] 

Xuan Gong commented on YARN-76:
---

I think this is a duplicate of YARN-3561, where we already have a discussion. 
Closing this as a duplicate.

 killApplication doesn't fully kill application master on Mac OS
 ---

 Key: YARN-76
 URL: https://issues.apache.org/jira/browse/YARN-76
 Project: Hadoop YARN
  Issue Type: Bug
 Environment: Failed on MacOS. OK on Linux
Reporter: Bo Wang
Assignee: Xuan Gong

 When a client sends ClientRMProtocol#killApplication to the RM, the 
 corresponding AM is supposed to be killed. However, on Mac OS, the AM stays 
 alive (w/o any interruption).
 I figured out part of the reason after some debugging. The NM starts an AM with 
 a command like /bin/bash -c /path/to/java SampleAM. This command is executed 
 in a process (say with PID 0001), which starts another Java process (say with 
 PID 0002). When the NM kills the AM, it sends SIGTERM and then SIGKILL to the 
 bash process (PID 0001). On Linux, the death of the bash process (PID 0001) 
 triggers the kill of the Java process (PID 0002). However, on Mac OS, only the 
 bash process is killed, and the Java process keeps running as an orphan.
 Note: on Mac OS, DefaultContainerExecutor is used rather than 
 LinuxContainerExecutor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-76) killApplication doesn't fully kill application master on Mac OS

2015-05-08 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14535046#comment-14535046
 ] 

Rohith commented on YARN-76:


[~xgong] Any update on this issue? Do you think the issue still exists in 
branch-2 or trunk?


[jira] [Commented] (YARN-76) killApplication doesn't fully kill application master on Mac OS

2013-07-31 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13725719#comment-13725719
 ] 

Xuan Gong commented on YARN-76:
---

[~bowang] I ran the sleep task and then killed the application. Could you tell 
me which command you ran to find out that the application master was not 
killed? I ran ps -A before and after the kill command, and I did not find the 
application master still alive.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-76) killApplication doesn't fully kill application master on Mac OS

2013-07-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13710758#comment-13710758
 ] 

Vinod Kumar Vavilapalli commented on YARN-76:
-

On Linux, we launch commands via a setsid binary, which is absent on Mac. We 
could try writing our own very simple setsid program or ask Mac users to 
install such a binary from elsewhere.



[jira] [Commented] (YARN-76) killApplication doesn't fully kill application master on Mac OS

2013-07-17 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13710762#comment-13710762
 ] 

Chris Nauroth commented on YARN-76:
---

Jira mark-up totally butchered the last comment.  :-)  That was meant to say:

With setsid enabled, we send kill commands with a hyphen prepended to the pid to 
indicate a process group, and thus kill the whole process group. Without setsid 
enabled, we don't prepend the hyphen...



[jira] [Commented] (YARN-76) killApplication doesn't fully kill application master on Mac OS

2013-07-17 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13710761#comment-13710761
 ] 

Chris Nauroth commented on YARN-76:
---

The fact that this only repros on Mac makes me suspect that we're seeing a 
problem related to setsid/process groups.  setsid is available on Linux, but 
it's not available on Mac.

https://github.com/apache/hadoop-common/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java#L304

With setsid enabled, we send kill commands with a '-' prepended to the pid to 
indicate a process group, and thus kill the whole process group.  Without 
setsid enabled, we don't prepend the '-' (no process group), and we only kill 
the single process.  Perhaps Xuan's suggestion to use pkill -P on platforms 
without setsid would work, though I haven't researched whether pkill is widely 
available or present only on Mac or certain BSD flavors.

I suspect that we don't have this problem on Windows.  On Windows, we don't 
have setsid, but we do have the concept of process groups using a different 
underlying implementation (Windows job objects).
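
A rough sketch of what the pkill -P idea could look like, for discussion only. 
This is not the code in Shell.java: the class and method names are made up, 
setsid availability is passed in as a parameter, and it assumes pkill accepts 
a numeric signal plus -P <ppid>. Note that pkill -P only matches direct 
children, so grandchildren would still be missed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: builds the commands that would be
// run to signal a container, with a pkill -P fallback when setsid is missing.
final class KillFallbackSketch {

    static List<String[]> buildKillCommands(int code, String pid, boolean setsidAvailable) {
        List<String[]> commands = new ArrayList<>();
        if (setsidAvailable) {
            // A negative pid addresses the whole process group.
            commands.add(new String[] { "kill", "-" + code, "-" + pid });
        } else {
            // Signal the direct children first, while pid is still their parent,
            // then signal the process itself.
            commands.add(new String[] { "pkill", "-" + code, "-P", pid });
            commands.add(new String[] { "kill", "-" + code, pid });
        }
        return commands;
    }

    public static void main(String[] args) {
        for (String[] cmd : buildKillCommands(15, "1234", false)) {
            System.out.println(String.join(" ", cmd));
        }
    }
}
```

The children are signalled before the parent on purpose: once the parent 
exits, they are re-parented and pkill -P with the old pid no longer finds them.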



[jira] [Commented] (YARN-76) killApplication doesn't fully kill application master on Mac OS

2013-07-17 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13710816#comment-13710816
 ] 

Xuan Gong commented on YARN-76:
---

{code}
  /** Return a command to send a signal to a given pid. */
  public static String[] getSignalKillCommand(int code, String pid) {
    return Shell.WINDOWS
        ? new String[] { Shell.WINUTILS, "task", "kill", pid }
        : new String[] { "kill", "-" + code, isSetsidAvailable ? "-" + pid : pid };
  }
{code}

If setsid is available, the command kill -15/-9 -$pid will be executed. This 
will kill the whole process group.
If setsid is not available, the command kill -15/-9 $pid will be executed, 
which signals only that single process.
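
To make the two cases concrete, here is a small standalone sketch of the 
non-Windows branch. It is only an illustration: the class name is made up, and 
setsid availability is passed in as a parameter instead of being read from the 
static isSetsidAvailable field.

```java
// Hypothetical stand-in for the non-Windows branch of getSignalKillCommand.
final class SignalKillCommandSketch {

    static String[] signalKillCommand(int code, String pid, boolean setsidAvailable) {
        // With setsid, the pid is negated so that kill targets the process group.
        return new String[] { "kill", "-" + code, setsidAvailable ? "-" + pid : pid };
    }

    public static void main(String[] args) {
        // SIGTERM to the whole process group (setsid available).
        System.out.println(String.join(" ", signalKillCommand(15, "1234", true)));
        // SIGKILL to the single process only (no setsid, e.g. on Mac).
        System.out.println(String.join(" ", signalKillCommand(9, "1234", false)));
    }
}
```

Running main prints kill -15 -1234 and then kill -9 1234. The second form 
leaves any child of pid 1234 untouched.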
