I believe that errors in containers are not propagated to the standard “Java”
logs.

You have to look at the stdout, stderr, and syslog files in the container's
log directory. Here is an example:

 

.../userlogs/application_1391549207212_0006/container_1391549207212_0006_01_000027

[htf@gfldesktop container_1391549207212_0006_01_000027]$ ls -lart
total 60
-rw-rw-r--  1 htf htf     0 Feb  4 17:27 stdout
-rw-rw-r--  1 htf htf     0 Feb  4 17:27 stderr
drwx--x--- 28 htf htf  4096 Feb  4 17:27 ..
drwx--x---  2 htf htf  4096 Feb  4 17:27 .
-rw-rw-r--  1 htf htf 50471 Feb  4 17:31 syslog
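
If it helps, here is a minimal Python sketch (not part of Hadoop) for pulling the tail of each non-empty log file in a container log directory like the one above. It assumes only the three file names shown in the listing (stdout, stderr, syslog). The function name tail_container_logs and the max_lines parameter are made up for illustration, and you have to pass in the real directory path yourself.

```python
from pathlib import Path

# File names as they appear in the container log directory listing above.
LOG_NAMES = ("stdout", "stderr", "syslog")


def tail_container_logs(container_dir, max_lines=20):
    """Return {file name: last max_lines lines} for each non-empty log file.

    Hypothetical helper: container_dir is the container's log directory,
    and max_lines is expected to be a positive integer.
    """
    result = {}
    for name in LOG_NAMES:
        path = Path(container_dir) / name
        # Skip files that are missing or empty (like stdout/stderr above).
        if not path.is_file() or path.stat().st_size == 0:
            continue
        # Read as text; undecodable bytes are replaced rather than raising.
        with path.open(errors="replace") as fh:
            lines = fh.read().splitlines()
        result[name] = lines[-max_lines:]
    return result
```

For a directory like the one in the listing, this would skip the empty stdout and stderr and return only the last lines of syslog, which is where I would look first for the reason behind the non-zero exit code.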

 

Regards

./g

 

-----Original Message-----
From: Jay Vyas [mailto:[email protected]] 
Sent: Friday, February 14, 2014 7:02 AM
To: [email protected]
Cc: <[email protected]>
Subject: Re: How to ascertain why LinuxContainer dies?

 

Not sure where the containers dump standard out/error to. I figured it would
be propagated to the node manager logs if anywhere, right?

 

 

> On Feb 14, 2014, at 4:46 AM, Harsh J <[email protected]> wrote:
>
> Hi,
>
> Does your container command generate any stderr/stdout outputs that
> you can check under the container's work directory after it fails?
>

>> On Fri, Feb 14, 2014 at 9:46 AM, Jay Vyas <[email protected]> wrote:
>> I have a linux container that dies.  The nodemanager logs only say:
>>

>> WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor:
>> Exception from container-launch :
>> org.apache.hadoop.util.Shell$ExitCodeException:
>>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:202)
>>   at org.apache.hadoop.util.Shell.run(Shell.java:129)
>>   at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:322)
>>   at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:230)
>>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:242)
>>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>   at java.lang.Thread.run(Thread.java:662)

>> 

>> Where can I find the root cause of the non-zero exit code?
>>
>> --
>> Jay Vyas
>> http://jayunit100.blogspot.com

>
> --
> Harsh J
