Hi Arun,

I was running on a single-node cluster, so all of my 100+ containers were on a single node. The problem went away when I increased YARN_HEAP_SIZE to 2 GB.
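For anyone trying the same workaround, here is a minimal sketch of what such a setting might look like in yarn-env.sh. It assumes the variable meant above is the stock YARN_HEAPSIZE (the spelling used by the Hadoop 2.x scripts, with the value in megabytes). Whether the daemon-specific YARN_NODEMANAGER_HEAPSIZE would be a better fit depends on the installation and is also an assumption here.

```shell
# Sketch only: raise the heap for the YARN daemons to 2 GB.
# Assumption: the stock variable is YARN_HEAPSIZE, value in MB.
export YARN_HEAPSIZE=2048

# Assumption: to change only the NodeManager, the daemon-specific
# variable could be used instead of the global one.
# export YARN_NODEMANAGER_HEAPSIZE=2048
```

The daemons would need a restart to pick up the change.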
Thanks,
Kishore

On Thu, Aug 1, 2013 at 5:01 AM, Arun C Murthy <[email protected]> wrote:

> How many containers are you running per node?
>
> On Jul 25, 2013, at 5:21 AM, Krishna Kishore Bonagiri <[email protected]> wrote:
>
> Hi Devaraj,
>
> I used to run this application with the same number of containers
> successfully on previous version, i.e. hadoop-2.0.4-alpha. Is it failing
> with the new version, because YARN itself is also adding some more threads
> than the previous versions?
>
> Thanks,
> Kishore
>
> On Thu, Jul 25, 2013 at 4:24 PM, Devaraj k <[email protected]> wrote:
>
>> Hi Kishore,
>>
>> It seems that system doesn’t have enough resources to launch a new
>> thread.
>>
>> Could you check the system is affordable to launch the configured
>> containers and try increasing the native memory available in the system by
>> reducing the no of running processes in the system.
>>
>> Thanks
>> Devaraj k
>>
>> From: Krishna Kishore Bonagiri [mailto:[email protected]]
>> Sent: 25 July 2013 16:09
>> To: [email protected]
>> Subject: Node manager crashing when running an app requiring 100
>> containers on hadoop-2.1.0-beta RC0
>>
>> Hi,
>>
>> I am running an application against hadoop-2.1.0-beta RC, and my app
>> requires 117 containers, I have got all the containers allocated, but while
>> starting those containers, at around 99th container the node manager has
>> gone down with the following kind of error in it's log. Also, I could
>> reproduce this error running a "sleep 200; date" command using the
>> Distributed Shell example, in which case I got this error at around 66th
>> container.
>>
>> 2013-07-25 06:07:17,743 FATAL org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[process reaper,5,main] threw an Error. Shutting down now...
>> java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
>>     at java.lang.Thread.startImpl(Native Method)
>>     at java.lang.Thread.start(Thread.java:887)
>>     at java.lang.ProcessInputStream.<init>(UNIXProcess.java:472)
>>     at java.lang.UNIXProcess$1$1$1.run(UNIXProcess.java:157)
>>     at java.security.AccessController.doPrivileged(AccessController.java:202)
>>     at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:137)
>> 2013-07-25 06:07:17,745 INFO org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException
>>
>> Thanks,
>> Kishore
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
