[ 
https://issues.apache.org/jira/browse/HADOOP-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15715954#comment-15715954
 ] 

Jason Lowe commented on HADOOP-13709:
-------------------------------------

Thanks for updating the patch!  Synchronization changes look good.

Thinking about the patch further, I believe this will break YARN nodemanager 
work-preserving restart.  Currently the nodemanager does not kill its 
subprocesses when work-preserving restart is enabled, but this 
kill-all-on-shutdown feature will kill them anyway.  At a minimum, I think we 
need to change it so that the Shell class is capable of tracking shell 
processes but doesn't always kill them on shutdown.  Anything that needs to 
kill things on shutdown (e.g., the problematic YARN localizer case that caused 
this to be filed) can register its own shutdown hook to call 
Shell.destroyAllProcesses.  Since this interface will be public, it would be 
good to provide some javadoc for it.
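
A minimal sketch of that split, assuming nothing about the actual patch: the 
class TrackedShell, its register/unregister/trackedCount helpers, and the 
KillOnShutdownCaller class are hypothetical names used only for illustration.  
Only the destroyAllProcesses name comes from the comment above.  The tracker 
never installs a shutdown hook itself; a caller that wants kill-on-shutdown 
opts in explicitly.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for Shell: tracks children, never kills on its own. */
final class TrackedShell {
    // Thread-safe set of live child processes (Process uses identity equality).
    private static final Set<Process> CHILDREN = ConcurrentHashMap.newKeySet();

    private TrackedShell() {}

    /** Starts a tracked process and waits for it, untracking it on exit. */
    static int runCommand(String... command)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).inheritIO().start();
        register(p);
        try {
            return p.waitFor();
        } finally {
            unregister(p);
        }
    }

    static void register(Process p) { CHILDREN.add(p); }

    static void unregister(Process p) { CHILDREN.remove(p); }

    static int trackedCount() { return CHILDREN.size(); }

    /**
     * Destroys every process that is currently tracked.
     *
     * <p>This is NOT called automatically at JVM shutdown. Callers that must
     * not leave children behind should invoke it from their own shutdown
     * hook; callers that want children to survive a restart should simply
     * never call it.
     */
    static void destroyAllProcesses() {
        for (Process p : CHILDREN) {
            // Only destroy a process we successfully claim from the set.
            if (CHILDREN.remove(p)) {
                p.destroy();
            }
        }
    }
}

/** Hypothetical caller that opts in to kill-on-shutdown. */
final class KillOnShutdownCaller {
    private KillOnShutdownCaller() {}

    static void installHook() {
        Runtime.getRuntime().addShutdownHook(
            new Thread(TrackedShell::destroyAllProcesses, "kill-children"));
    }
}
```

With this shape, a process such as the localizer would call 
KillOnShutdownCaller.installHook() once at startup, while a nodemanager with 
work-preserving restart enabled would not install the hook and its children 
would be left running.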


> Clean up subprocesses spawned by Shell.java:runCommand when the shell process 
> exits
> -----------------------------------------------------------------------------------
>
>                 Key: HADOOP-13709
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13709
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>         Attachments: HADOOP-13709.001.patch, HADOOP-13709.002.patch, 
> HADOOP-13709.003.patch, HADOOP-13709.004.patch, HADOOP-13709.005.patch, 
> HADOOP-13709.006.patch, HADOOP-13709.007.patch, HADOOP-13709.008.patch
>
>
> The runCommand code in Shell.java can get into a situation where it will 
> ignore InterruptedExceptions and refuse to shut down because it is blocked 
> in I/O, waiting for the return value of the subprocess that was spawned. We 
> need to allow for the subprocess to be interrupted and killed when the shell 
> process gets killed. Currently the JVM will shut down and all of the 
> subprocesses will be orphaned and not killed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

