I believe that errors on containers are not propagated to the standard “Java” logs.
You have to look into the std* and syslog files of the container. Here is an example:

.../userlogs/application_1391549207212_0006/container_1391549207212_0006_01_000027

[htf@gfldesktop container_1391549207212_0006_01_000027]$ ls -lart
total 60
-rw-rw-r--  1 htf htf     0 Feb  4 17:27 stdout
-rw-rw-r--  1 htf htf     0 Feb  4 17:27 stderr
drwx--x--- 28 htf htf  4096 Feb  4 17:27 ..
drwx--x---  2 htf htf  4096 Feb  4 17:27 .
-rw-rw-r--  1 htf htf 50471 Feb  4 17:31 syslog

Regards
./g

-----Original Message-----
From: Jay Vyas [mailto:[email protected]]
Sent: Friday, February 14, 2014 7:02 AM
To: [email protected]
Cc: <[email protected]>
Subject: Re: How to ascertain why LinuxContainer dies?

Not sure where the containers dump standard out/error to? I figured it would be propagated in the node manager logs if anywhere, right?

Sent from my iPhone

> On Feb 14, 2014, at 4:46 AM, Harsh J <[email protected]> wrote:
>
> Hi,
>
> Does your container command generate any stderr/stdout outputs that
> you can check under the container's work directory after it fails?
>
>> On Fri, Feb 14, 2014 at 9:46 AM, Jay Vyas <[email protected]> wrote:
>> I have a linux container that dies. The nodemanager logs only say:
>>
>> WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exception from container-launch :
>> org.apache.hadoop.util.Shell$ExitCodeException:
>>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:202)
>>     at org.apache.hadoop.util.Shell.run(Shell.java:129)
>>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:322)
>>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:230)
>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:242)
>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68)
>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:662)
>>
>> where can i find the root cause of the non-zero exit code ?
>>
>> --
>> Jay Vyas
>> http://jayunit100.blogspot.com
>
> --
> Harsh J
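
For anyone following up on the advice at the top of this thread (check the stdout, stderr and syslog files in the container's log directory), here is a minimal Python sketch that reports the size of each of those files and the last few lines of any that are non-empty. The function name `summarize_container_logs` and the `tail_lines` parameter are made up for illustration; the only thing taken from the thread is the three file names, and the directory you pass in is whatever your own userlogs/<application>/<container> path happens to be.

```python
from collections import deque
from pathlib import Path

# File names taken from the directory listing earlier in the thread.
LOG_NAMES = ("stdout", "stderr", "syslog")


def summarize_container_logs(container_dir, tail_lines=20):
    """Return {name: (size_in_bytes, last_lines)} for each log file present.

    container_dir is the per-container log directory (hypothetical helper,
    not part of Hadoop). Files that do not exist are skipped.
    """
    summary = {}
    for name in LOG_NAMES:
        path = Path(container_dir) / name
        if not path.is_file():
            continue
        size = path.stat().st_size
        tail = []
        if size > 0:
            # deque with maxlen keeps only the last tail_lines lines.
            with path.open(errors="replace") as fh:
                tail = [line.rstrip("\n") for line in deque(fh, maxlen=tail_lines)]
        summary[name] = (size, tail)
    return summary
```

Calling `summarize_container_logs(path_to_container_dir)` on the directory from the listing above would, going by the sizes shown there, report stdout and stderr as empty and return the end of syslog, which is usually where to start looking for the reason behind the non-zero exit code.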
