Hi Devaraj,

I used to run this application successfully with the same number of containers on the previous version, i.e. hadoop-2.0.4-alpha. Is it failing with the new version because YARN itself now creates more threads than the previous versions did?

Thanks,
Kishore

On Thu, Jul 25, 2013 at 4:24 PM, Devaraj k <devara...@huawei.com> wrote:

> Hi Kishore,
>
> It seems that system doesn’t have enough resources to launch a new thread.
>
> Could you check the system is affordable to launch the configured
> containers and try increasing the native memory available in the system by
> reducing the no of running processes in the system.
>
> Thanks
> Devaraj k
>
> *From:* Krishna Kishore Bonagiri [mailto:write2kish...@gmail.com]
> *Sent:* 25 July 2013 16:09
> *To:* user@hadoop.apache.org
> *Subject:* Node manager crashing when running an app requiring 100
> containers on hadoop-2.1.0-beta RC0
>
> Hi,
>
> I am running an application against hadoop-2.1.0-beta RC, and my app
> requires 117 containers, I have got all the containers allocated, but while
> starting those containers, at around 99th container the node manager has
> gone down with the following kind of error in it's log. Also, I could
> reproduce this error running a "sleep 200; date" command using the
> Distributed Shell example, in which case I got this error at around 66th
> container.
>
> 2013-07-25 06:07:17,743 FATAL org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[process reaper,5,main] threw an Error. Shutting down now...
> java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
>         at java.lang.Thread.startImpl(Native Method)
>         at java.lang.Thread.start(Thread.java:887)
>         at java.lang.ProcessInputStream.<init>(UNIXProcess.java:472)
>         at java.lang.UNIXProcess$1$1$1.run(UNIXProcess.java:157)
>         at java.security.AccessController.doPrivileged(AccessController.java:202)
>         at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:137)
> 2013-07-25 06:07:17,745 INFO org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException
>
> Thanks,
> Kishore
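
A note for anyone reproducing this: on Linux, errno 11 is EAGAIN, which thread creation typically returns when a resource limit (for example the per-user process/thread limit) has been reached. The following is only a rough sketch, assuming a Linux node with /proc mounted, of how one might compare the per-user limit with the thread count of a running process such as the node manager. The helper names `parse_thread_count` and `threads_of` are made up for illustration and are not part of Hadoop; the thread does not confirm that this limit is the actual cause.

```python
import os
import resource  # Unix-only standard-library module
import sys


def parse_thread_count(status_text):
    """Return the integer in the 'Threads:' field of /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    return None


def threads_of(pid):
    """Read the current thread count of a process from /proc (Linux only).

    Returns None if the status file cannot be read.
    """
    try:
        with open(f"/proc/{pid}/status") as f:
            return parse_thread_count(f.read())
    except OSError:
        return None


def main():
    # Assumption: the pid of the process to inspect is passed as the first
    # argument; with no argument, this script inspects itself.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()

    # RLIMIT_NPROC is the limit reported by `ulimit -u`; on Linux it counts
    # threads as well as processes for the user.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)

    print("errno 11 means:", os.strerror(11))
    print("max user processes (soft, hard):", soft, hard)
    print("threads in pid", pid, ":", threads_of(pid))


if __name__ == "__main__":
    main()
```

Running it with the node manager's pid while containers are starting would show whether the thread count is approaching the soft limit; if it is not, the failure may instead come from native memory exhaustion, as Devaraj suggested.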