[
https://issues.apache.org/jira/browse/HADOOP-14846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161244#comment-16161244
]
Yuqi Wang edited comment on HADOOP-14846 at 9/11/17 1:38 PM:
-------------------------------------------------------------
[[email protected]]
Yes, exactly. The caller gets exit code 0 even when no exit code is actually
available. One example of a bug caused by this is
DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212).
I agree with propagating the exit code; however, java.lang.ProcessBuilder.start
throws an exception instead of returning a non-zero exit code. There are also
cases where we cannot get any exit code at all, such as an OOM during the process.
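
To make the distinction concrete, here is a minimal sketch (not the attached
patch) of a caller that keeps "the process could not be started" separate from
"the process exited with 0". The class name, the EXIT_CODE_UNAVAILABLE sentinel
and its value, and the program name are assumptions made up for illustration;
they are not Hadoop constants or APIs.

```java
import java.io.IOException;

class ShellExitCodeSketch {
    // Hypothetical sentinel meaning "no exit code was ever produced".
    static final int EXIT_CODE_UNAVAILABLE = -1;

    // Runs the command and returns its exit code, or the sentinel when the
    // process could not be started or the wait was interrupted.
    static int run(String... command) {
        // Start from the sentinel rather than 0, so a failed start can never
        // be mistaken for a successful run.
        int exitCode = EXIT_CODE_UNAVAILABLE;
        try {
            Process process = new ProcessBuilder(command).inheritIO().start();
            exitCode = process.waitFor();
        } catch (IOException e) {
            // ProcessBuilder.start reports a missing executable by throwing,
            // not by returning a non-zero exit code.
            System.err.println("Could not start process: " + e.getMessage());
        } catch (InterruptedException e) {
            // Restore the interrupt flag; the exit code stays unavailable.
            Thread.currentThread().interrupt();
        }
        return exitCode;
    }

    public static void main(String[] args) {
        // Assumed not to exist on the machine running this sketch.
        int code = run("no-such-program-hadoop-14846");
        System.out.println("exit code: " + code);
    }
}
```

If the exit-code variable were initialized to 0 instead, the failed start above
would be reported as a success, which is the symptom described in this issue.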
was (Author: yqwang):
[[email protected]]
Yes, exactly. Caller will get exitcode 0, even when the exitcode itself is not
available to use. And one example bug caused by this is
DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212).
Agree with the propagating, however, java.lang.ProcessBuilder.start throws
exception, instead of non zero exit code. And there are cases that we cannot
get any exitcode just exception from the code, such as OOM during the process.
> Wrong shell exit code if the shell process cannot be even started
> -----------------------------------------------------------------
>
> Key: HADOOP-14846
> URL: https://issues.apache.org/jira/browse/HADOOP-14846
> Project: Hadoop Common
> Issue Type: Bug
> Components: util
> Affects Versions: 2.7.1
> Reporter: Yuqi Wang
> Labels: shell
> Fix For: 3.0.0-alpha4
>
> Attachments: HADOOP-14846.001.patch, HADOOP-14846.002.patch
>
>
> *Hadoop may hide shell failures (including container start and fs operation
> failures), such as:*
> Container exit diagnostics (note the container exit code is 0):
> {code:java}
> Exception from container-launch.
> Container id: container_e5620_1503888150197_2979_01_003320
> Exit code: 0
> Exception message: Cannot run program "D:\data\hadoop.latest\bin\winutils.exe" (in directory "\data\yarnnm\local\usercache\hadoop\appcache\application_1503888150197_2979\container_e5620_1503888150197_2979_01_003320"): CreateProcess error=2, The system cannot find the file specified
> Stack trace: java.io.IOException: Cannot run program "D:\data\hadoop.latest\bin\winutils.exe" (in directory "\data\yarnnm\local\usercache\hadoop\appcache\application_1503888150197_2979\container_e5620_1503888150197_2979_01_003320"): CreateProcess error=2, The system cannot find the file specified
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:517)
> at org.apache.hadoop.util.Shell.run(Shell.java:490)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:756)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:329)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:86)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}