Ilya, could you upload a full stack trace of the failure so we can see where the call chain originated?
Ram

On Mon, Mar 21, 2016 at 9:21 AM, Ganelin, Ilya <ilya.gane...@capitalone.com> wrote:

Chandni - my application fails when launching in YARN, not in local mode. There is no custom partitioning - the code in the example is complete for both the input and output classes.

Sent with Good (www.good.com)

________________________________
From: Chandni Singh <chan...@datatorrent.com>
Sent: Monday, March 21, 2016 3:45:46 AM
To: dev@apex.incubator.apache.org
Subject: Re: Stack overflow errors when launching job

debug.zip
<https://drive.google.com/a/datatorrent.com/file/d/0BxX8sOLG8CxHLXFjUjBxM0hIZDg/view?usp=drive_web>

Hi Ilya,

Attached is the debug application with 20 partitions of the input and output operators. I changed the default locality. This application doesn't fail in local mode.

I am using the StatelessPartitioner for both input and output. The test configuration is in ApplicationTest and the cluster configuration is in my-app-conf1.xml.

Have you added custom partitioning? That may be causing the stack overflow in the app master.

Can you modify this application so that ApplicationTest throws this stack overflow?

- Chandni

On Sun, Mar 20, 2016 at 11:30 AM, Chandni Singh <chan...@datatorrent.com> wrote:

Hi Ilya,

As Ram mentioned, we don't know the beginning of the stack trace from which this is triggered. We can add JVM options in the configuration file so that the app master is deployed with those options.

Anyway, I will look into creating this application (with 20 partitions) and run it in local mode to find out where the problem is.

Will get back to you today or tomorrow.

Chandni

On Sun, Mar 20, 2016 at 9:54 AM, Amol Kekre <a...@datatorrent.com> wrote:

Can we get on a webex to take a look?
thanks
Amol

On Sat, Mar 19, 2016 at 7:27 PM, Ganelin, Ilya <ilya.gane...@capitalone.com> wrote:

I don't think I really have any time to connect to the container. The application launches and crashes almost immediately. Total runtime is 50 seconds.

Sent with Good (www.good.com)

________________________________
From: Munagala Ramanath <r...@datatorrent.com>
Sent: Saturday, March 19, 2016 5:39:11 PM
To: dev@apex.incubator.apache.org
Subject: Re: Stack overflow errors when launching job

There is some info here, near the end of the page:

http://docs.datatorrent.com/troubleshooting/

under the heading "How do I get a heap dump when a container gets an OutOfMemoryError?"

However, since you're blowing the stack, you may need to manually run jmap on the running container, which may be difficult if it doesn't stay up for very long. There is a way to dump the heap programmatically, as described, for instance, here:

https://blogs.oracle.com/sundararajan/entry/programmatically_dumping_heap_from_java

Ram

On Sat, Mar 19, 2016 at 2:07 PM, Ganelin, Ilya <ilya.gane...@capitalone.com> wrote:

How would we go about getting a heap dump?

Sent with Good (www.good.com)

________________________________
From: Yogi Devendra <yogideven...@apache.org>
Sent: Saturday, March 19, 2016 12:19:26 AM
To: dev@apex.incubator.apache.org
Subject: Re: Stack overflow errors when launching job

The stack trace in the gist shows some symptoms of infinite recursion, but I could not figure out the exact cause.
Can you please check your heap dump to see if there are any cycles in the object hierarchy?

~ Yogi

On 19 March 2016 at 00:36, Ashwin Chandra Putta <ashwinchand...@gmail.com> wrote:

In the example you posted, do you have any locality constraint applied?

From what I see, you have two operators - an HDFS input operator and an HDFS output operator. Each of them has 40 partitions, and you don't have any other constraints on them. The partitioner implementation you are using is com.datatorrent.common.partitioner.StatelessPartitioner.

Please confirm.

Regards,
Ashwin.

On Thu, Mar 17, 2016 at 5:00 PM, Ganelin, Ilya <ilya.gane...@capitalone.com> wrote:

I've updated the gist with a more complete example and updated the associated JIRA that I created:
https://issues.apache.org/jira/browse/APEXCORE-392

On 3/17/16, 4:33 AM, "Tushar Gosavi" <tus...@datatorrent.com> wrote:

Hi,

I created a sample application with the operators from the given link - just a simple input and output - and created 32 partitions of each. I could not reproduce the stack overflow issue. Do you have a small sample application which could reproduce this issue?
  @Override
  public void populateDAG(DAG dag, Configuration configuration)
  {
    NewlineFileInputOperator in = dag.addOperator("Input", new NewlineFileInputOperator());
    in.setDirectory("/user/tushar/data");
    in.setPartitionCount(32);

    HdfsFileOutputOperator out = dag.addOperator("Output", new HdfsFileOutputOperator());
    out.setFilePath("/user/tushar/outdata");

    dag.getMeta(out).getAttributes().put(Context.OperatorContext.PARTITIONER,
        new StatelessPartitioner<HdfsFileOutputOperator>(32));

    dag.addStream("s1", in.output, out.input);
  }

-Tushar.

On Thu, Mar 17, 2016 at 12:30 AM, Ganelin, Ilya <ilya.gane...@capitalone.com> wrote:

Hi guys - I'm running into a very frustrating issue where certain DAG configurations cause the following error log (attached). When this happens, my application even fails to launch. This does not seem to be a YARN issue, since it occurs even with a relatively small number of partitions and a modest amount of memory.

I've attached the input and output operators in question:
https://gist.github.com/ilganeli/7f770374113b40ffa18a

I can get this to occur predictably by:

1. Increasing the partition count on my input operator (reads from HDFS) - values above 20 cause this error
2. Increasing the partition count on my output operator (writes to HDFS) - values above 20 cause this error
3. Setting the stream locality from the default to THREAD_LOCAL, NODE_LOCAL, or CONTAINER_LOCAL on the output operator

This behavior is very frustrating, as it's preventing me from partitioning my HDFS I/O appropriately and thus from scaling to higher throughputs.

Do you have any thoughts on what's going wrong? I would love your feedback.

________________________________________________________

The information contained in this e-mail is confidential and/or proprietary to Capital One and/or its affiliates and may only be used solely in performance of work or services for Capital One. The information transmitted herewith is intended only for use by the individual or entity to which it is addressed. If the reader of this message is not the intended recipient, you are hereby notified that any review, retransmission, dissemination, distribution, copying or other use of, or taking of any action in reliance upon this information is strictly prohibited. If you have received this communication in error, please contact the sender and delete the material from your computer.
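[A note on the heap-dump question from earlier in the thread: the programmatic approach in the Oracle post Ram linked boils down to calling the HotSpot diagnostic MXBean, which is useful precisely when a container dies too quickly to attach jmap by hand. A minimal sketch - the class name and output path below are illustrative, not from this thread - which could be invoked from an operator's setup() or a shutdown hook:]

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import com.sun.management.HotSpotDiagnosticMXBean;

public class HeapDumper {
  /**
   * Dump the current JVM's heap to the given .hprof file.
   * The file must not already exist, or dumpHeap throws IOException.
   * live=true restricts the dump to reachable objects, keeping it smaller.
   */
  public static void dumpHeap(String filePath, boolean live) throws IOException {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    HotSpotDiagnosticMXBean mxBean = ManagementFactory.newPlatformMXBeanProxy(
        server, "com.sun.management:type=HotSpotDiagnostic", HotSpotDiagnosticMXBean.class);
    mxBean.dumpHeap(filePath, live);
  }

  public static void main(String[] args) throws IOException {
    String path = args.length > 0 ? args[0] : "container-heap.hprof";
    dumpHeap(path, true);
    System.out.println("Heap written to " + path);
  }
}
```

[If the container does stay up long enough, `jmap -dump:live,format=b,file=heap.hprof <pid>` achieves the same from outside the JVM.]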
--
Regards,
Ashwin.
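[A footnote on Chandni's suggestion of adding JVM options through the configuration file: the DataTorrent troubleshooting page linked earlier describes setting container JVM options via an application attribute. A hedged sketch - the exact property name and dump path are assumptions following the dt.attr convention, so verify against the docs:]

```xml
<property>
  <name>dt.attr.CONTAINER_JVM_OPTIONS</name>
  <value>-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdumps</value>
</property>
```

[Note that a StackOverflowError does not trigger -XX:+HeapDumpOnOutOfMemoryError, so for the failure in this thread a programmatic dump, or a larger thread stack via -Xss, may be more relevant.]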