[jira] [Commented] (YARN-3066) Hadoop leaves orphaned tasks running after job is killed

2016-05-02 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267966#comment-15267966
 ] 

Paul Rogers commented on YARN-3066:
---

This problem persists on Mac OS X El Capitan. However, a workaround exists in the 
form of this simple Git project: https://github.com/jerrykuch/ersatz-setsid.

* Install the Apple Xcode command line tools.
* Clone the project.
* Build the project using make:
    cd ersatz-setsid
    make
* Copy the resulting setsid program to /usr/bin:
    sudo cp setsid /usr/bin
* Restart YARN.

Now your process will shut down correctly.

YARN already includes C code. YARN could save us all a ton of grief (it took 
me a day to trace this issue back to this bug) by including its own 
version of setsid.

> Hadoop leaves orphaned tasks running after job is killed
> 
>
> Key: YARN-3066
> URL: https://issues.apache.org/jira/browse/YARN-3066
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
> Environment: Hadoop 2.4.1 (probably all later too), FreeBSD-10.1
>Reporter: Dmitry Sivachenko
>
> When spawning user task, node manager checks for setsid(1) utility and spawns 
> task program via it. See 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java
>  for instance:
> String exec = Shell.isSetsidAvailable? "exec setsid" : "exec";
> FreeBSD, unlike Linux, does not have setsid(1) utility.  So plain "exec" is 
> used to spawn user task.  If that task spawns other external programs (this 
> is common case if a task program is a shell script) and user kills job via 
> mapred job -kill , these child processes remain running.
> 1) Why do you silently ignore the absence of setsid(1) and spawn task process 
> via exec: this is the guarantee to have orphaned processes when job is 
> prematurely killed.
> 2) FreeBSD has a replacement third-party program called ssid (which does 
> almost the same as Linux's setsid).  It would be nice to detect which binary 
> is present during configure stage and put @SETSID@ macros into java file to 
> use the correct name.
> I propose to make Shell.isSetsidAvailable test more strict and fail to start 
> if it is not found:  at least we will know about the problem at start rather 
> than guess why there are orphaned tasks running forever.
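
The detection proposed in point 2 of the description could be sketched roughly as follows. This is only an illustration, not Hadoop code: the report suggests detecting the binary at configure time, while this sketch looks it up on the PATH at run time, and the class name SessionWrapperFinder and its methods are invented for the example. It also assumes that ssid accepts the same command-line usage as setsid.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Hadoop.
final class SessionWrapperFinder {
    // Candidate wrapper names: setsid (Linux) and ssid (the third-party tool from the report).
    static final List<String> CANDIDATES = Arrays.asList("setsid", "ssid");

    /** Returns the first candidate that exists as an executable file in the given PATH string. */
    static Optional<String> find(String pathVar, List<String> candidates) {
        if (pathVar == null || pathVar.isEmpty()) {
            return Optional.empty();
        }
        for (String name : candidates) {
            for (String dir : pathVar.split(File.pathSeparator)) {
                if (dir.isEmpty()) {
                    continue;
                }
                File f = new File(dir, name);
                if (f.isFile() && f.canExecute()) {
                    return Optional.of(name);
                }
            }
        }
        return Optional.empty();
    }

    /** Builds the launch prefix; falls back to plain "exec" when no wrapper was found. */
    static String execPrefix(Optional<String> wrapper) {
        return wrapper.map(w -> "exec " + w).orElse("exec");
    }

    public static void main(String[] args) {
        Optional<String> wrapper = find(System.getenv("PATH"), CANDIDATES);
        System.out.println(execPrefix(wrapper));
    }
}
```

The fallback to plain "exec" mirrors the current behaviour quoted above; a stricter variant would refuse to start instead.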



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3066) Hadoop leaves orphaned tasks running after job is killed

2015-05-28 Thread Alan Burlison (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562618#comment-14562618
 ] 

Alan Burlison commented on YARN-3066:
-

Bugtraq is long gone; everything is now in the bug database, accessible via My 
Oracle Support (https://support.oracle.com).



[jira] [Commented] (YARN-3066) Hadoop leaves orphaned tasks running after job is killed

2015-05-28 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563012#comment-14563012
 ] 

Allen Wittenauer commented on YARN-3066:


So yes, it's still sealed off without a contract. Meh.




[jira] [Commented] (YARN-3066) Hadoop leaves orphaned tasks running after job is killed

2015-05-27 Thread Alan Burlison (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560762#comment-14560762
 ] 

Alan Burlison commented on YARN-3066:
-

As Linux, OSX, Solaris and BSD all support the setsid(2) syscall and it's part 
of POSIX (http://pubs.opengroup.org/onlinepubs/9699919799/toc.htm), isn't a 
better solution just to wrap setsid() + exec() in a little bit of JNI? That 
would avoid the need to install external executables.



[jira] [Commented] (YARN-3066) Hadoop leaves orphaned tasks running after job is killed

2015-05-27 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561118#comment-14561118
 ] 

Allen Wittenauer commented on YARN-3066:


bq. As Linux, OSX, Solaris and BSD all support the setsid(2) syscall and it's 
part of POSIX (http://pubs.opengroup.org/onlinepubs/9699919799/toc.htm), isn't 
a better solution just to wrap setsid() + exec() in a little bit of JNI? That 
would avoid the need to install external executables.

That would break platforms that don't have a working libhadoop (which are 
plentiful).

However, there could be a test here that says if libhadoop is available, use it.
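
The "if libhadoop is available, use it" test could look roughly like the sketch below. It is an assumption-laden illustration, not the actual Hadoop logic: the SessionStrategy class, the Mode enum and both boolean inputs are invented here. In a real patch the first flag would presumably come from Hadoop's native-library check and the second from Shell.isSetsidAvailable.

```java
// Hypothetical selection logic, not part of Hadoop.
final class SessionStrategy {
    enum Mode { NATIVE_SETSID, SETSID_BINARY, PLAIN_EXEC }

    /**
     * Prefers a setsid() call through the native library when it is loaded,
     * then an external setsid binary, and only then a plain exec.
     */
    static Mode choose(boolean nativeLibLoaded, boolean setsidOnPath) {
        if (nativeLibLoaded) {
            return Mode.NATIVE_SETSID;
        }
        if (setsidOnPath) {
            return Mode.SETSID_BINARY;
        }
        return Mode.PLAIN_EXEC;
    }

    /** Only a task started in its own session can later be killed as a whole group. */
    static boolean canKillWholeGroup(Mode mode) {
        return mode != Mode.PLAIN_EXEC;
    }

    public static void main(String[] args) {
        boolean[] flags = {true, false};
        for (boolean nativeLib : flags) {
            for (boolean binary : flags) {
                Mode mode = choose(nativeLib, binary);
                System.out.println("native=" + nativeLib + " binary=" + binary
                        + " -> " + mode + " (group kill: " + canKillWholeGroup(mode) + ")");
            }
        }
    }
}
```

Platforms without a working libhadoop keep today's behaviour, so nothing breaks there; they just gain nothing either.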



[jira] [Commented] (YARN-3066) Hadoop leaves orphaned tasks running after job is killed

2015-05-27 Thread Alan Burlison (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561130#comment-14561130
 ] 

Alan Burlison commented on YARN-3066:
-

Yes, that's a good point about not every platform having libhadoop. Solaris for 
example has the syscall but not the executable, so in that case it's a better 
solution to use the syscall but that's not always going to be the case.



[jira] [Commented] (YARN-3066) Hadoop leaves orphaned tasks running after job is killed

2015-05-27 Thread Dmitry Sivachenko (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561234#comment-14561234
 ] 

Dmitry Sivachenko commented on YARN-3066:
-

Solaris can use the same ssid program (it is just a simple wrapper for the setsid() 
syscall).
I just proposed the simplest fix for that problem.
A JNI wrapper sounds like a better approach.

What I want to see in any case is a loud error message in case the setsid binary 
(or the setsid() syscall if we go the JNI way) is unavailable.  Right now it pretends 
to work, and I spent some time digging out what's going wrong and why I see a lot 
of orphans.



[jira] [Commented] (YARN-3066) Hadoop leaves orphaned tasks running after job is killed

2015-05-27 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562114#comment-14562114
 ] 

Allen Wittenauer commented on YARN-3066:


Is Bugtraq+ (if that is what it is still called... haven't been a Sun employee 
for a while...)  still sealed off?



[jira] [Commented] (YARN-3066) Hadoop leaves orphaned tasks running after job is killed

2015-01-15 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279202#comment-14279202
 ] 

Chris Nauroth commented on YARN-3066:
-

I'm not familiar with {{ssid}} on FreeBSD.  Does it have the same usage as 
Linux {{setsid}}?  If so, then perhaps an appropriate workaround is to copy 
that binary to {{setsid}} and make sure it's available on the {{PATH}}.  This 
might not require any YARN code changes.

bq. I propose to make Shell.isSetsidAvailable test more strict and fail to 
start if it is not found.

This would likely have to be considered backwards-incompatible, because 
applications would fail to start on existing systems that don't have 
{{setsid}}.  I suppose the new behavior could be hidden behind an opt-in 
configuration property.  Also, we need to keep in mind that 
{{Shell.isSetsidAvailable}} is always {{false}} on Windows.  (On Windows, we 
handle the issue of orphaned processes by using Windows API job objects instead 
of {{setsid}}.)
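
An opt-in strict check of the kind described above might be sketched like this. Everything here is hypothetical: the property name yarn.nodemanager.require-setsid does not exist in YARN and is invented for the example, as is the SetsidStartupCheck class, and java.util.Properties stands in for the Hadoop configuration object.

```java
import java.util.Properties;

// Hypothetical startup check, not part of Hadoop.
final class SetsidStartupCheck {
    // Invented property name for this sketch; not an existing YARN setting.
    static final String REQUIRE_SETSID_KEY = "yarn.nodemanager.require-setsid";

    /**
     * Default (property unset or false): warn and continue, so existing systems
     * without setsid keep starting. Opt-in strict mode: refuse to start.
     * Windows is exempt because orphan handling there does not rely on setsid.
     */
    static void verify(Properties conf, boolean isWindows, boolean setsidAvailable) {
        if (isWindows || setsidAvailable) {
            return;
        }
        String msg = "setsid not found: killed containers may leave orphaned child processes";
        boolean strict = Boolean.parseBoolean(conf.getProperty(REQUIRE_SETSID_KEY, "false"));
        if (strict) {
            throw new IllegalStateException(msg);
        }
        System.err.println("WARNING: " + msg);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        verify(conf, false, false); // default mode: prints a warning only
        conf.setProperty(REQUIRE_SETSID_KEY, "true");
        try {
            verify(conf, false, false); // strict mode: fails fast
        } catch (IllegalStateException e) {
            System.err.println("startup aborted: " + e.getMessage());
        }
    }
}
```

Keeping the default lenient preserves backwards compatibility, while the warning at least makes the missing binary visible at startup.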



[jira] [Commented] (YARN-3066) Hadoop leaves orphaned tasks running after job is killed

2015-01-15 Thread Dmitry Sivachenko (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279287#comment-14279287
 ] 

Dmitry Sivachenko commented on YARN-3066:
-

The Windows case is tested separately, see private static boolean 
isSetsidSupported() in
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java

for instance:

if (Shell.WINDOWS) {
  return false;
}

In any UNIX-like case I suppose it will leave orphaned processes, because if 
isSetsidSupported()==false it uses kill(pid) to kill task instead of kill(pgid) 
to kill the whole process group.
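
To illustrate the difference between the two kill targets: a task started through setsid becomes the leader of a new session and process group, so its process group ID equals its PID, and passing the negated PID to kill signals every process in that group. The sketch below only builds the argument vector; the KillTarget class and its method are invented for the example and are not Hadoop's implementation. Whether a given kill accepts "--" before a negative target depends on the implementation, so treat that as an assumption.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Hadoop.
final class KillTarget {
    /**
     * Builds a kill command line. When the task runs in its own process group,
     * the target is the negated PID, which addresses the whole group;
     * otherwise only the single process is signalled and its children survive.
     */
    static String[] killCommand(int signal, long pid, boolean ownProcessGroup) {
        String target = ownProcessGroup ? "-" + pid : Long.toString(pid);
        return new String[] {"kill", "-" + signal, "--", target};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(killCommand(9, 1234L, true)));
        System.out.println(Arrays.toString(killCommand(9, 1234L, false)));
    }
}
```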

ssid(1) in FreeBSD is the analog of setsid(1) in Linux: a userland wrapper for the 
setsid() system call.

Renaming does not sound like a sane idea, because it is hard to convince everyone 
to rename installed binaries by hand.

I propose to treat it as a system-dependent option and act accordingly.

(I suppose other OSes like Solaris also lack the setsid(1) utility, so they could 
also benefit.)

For ssid source see http://tools.suckless.org/ssid/

As for backwards compatibility, we can change that in 3.0. It is not fatal: 
failure to start without setsid will just remind users to install setsid(1) or 
ssid(1) and proceed further, and be sure that there will be no side effects like 
orphaned tasks eating CPU.
