The call chain is not complete; it ends abruptly with:

    at java.util.ArrayList.writeObject(ArrayList.java:742)
    at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1495)
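The repeating `ArrayList.writeObject` / `invokeWriteObject` frames are characteristic of Java serialization recursing once per level of a deeply nested (or cyclic) object graph. This stdlib-only sketch is not the Apex code, just an illustration of that failure mode:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

// Illustration: ObjectOutputStream walks nested containers recursively,
// so a deep enough chain of lists overflows the stack inside
// ArrayList.writeObject, matching the truncated trace above.
public class DeepGraphSerialization {

    static String serializeDeeplyNestedList(int depth) {
        // Build a chain of lists: each list's only element is the next list.
        ArrayList<Object> root = new ArrayList<>();
        ArrayList<Object> cur = root;
        for (int i = 0; i < depth; i++) {
            ArrayList<Object> next = new ArrayList<>();
            cur.add(next);
            cur = next;
        }
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(root); // recurses once per nesting level
            return "ok";
        } catch (IOException e) {
            return "io-error";
        } catch (StackOverflowError e) {
            return "stack-overflow";
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeDeeplyNestedList(10));      // shallow graph serializes fine
        System.out.println(serializeDeeplyNestedList(500_000)); // deep graph blows the stack
    }
}
```

A full stack trace would show, below the repeating frames, the first object whose `writeObject` kicked off the recursion, which is exactly the "point of origin" being asked for here.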
We need to see the point of origin.

Ram

On Mon, Mar 21, 2016 at 10:02 AM, Ganelin, Ilya <ilya.gane...@capitalone.com> wrote:
> I uploaded the complete stack trace to the gist in the issue:
> https://gist.github.com/ilganeli/7f770374113b40ffa18a

On 3/21/16, 9:38 AM, "Munagala Ramanath" <r...@datatorrent.com> wrote:
> Ilya, could you upload a full stack trace of the failure so we can see
> where the call chain originated?
>
> Ram

On Mon, Mar 21, 2016 at 9:21 AM, Ganelin, Ilya <ilya.gane...@capitalone.com> wrote:
> Chandni - my application fails when launching in YARN, not in local mode.
> There is no custom partitioning - the code in the example is complete for
> both the input and output classes.
>
> Sent with Good (www.good.com)

On Monday, March 21, 2016 3:45:46 AM, Chandni Singh <chan...@datatorrent.com> wrote:
> Hi Ilya,
>
> Attached is the debug application (debug.zip:
> https://drive.google.com/a/datatorrent.com/file/d/0BxX8sOLG8CxHLXFjUjBxM0hIZDg/view?usp=drive_web)
> with 20 partitions of input and output operators. I changed the default
> locality. This application doesn't fail in local mode.
>
> I am using the StatelessPartitioner for both input and output. Test
> configuration is in ApplicationTest and cluster configuration is in
> my-app-conf1.xml.
>
> Have you added custom partitioning? That may be causing the stack
> overflow in the app master.
>
> Can you modify this application so that the ApplicationTest throws this
> stack overflow?
>
> - Chandni

On Sun, Mar 20, 2016 at 11:30 AM, Chandni Singh <chan...@datatorrent.com> wrote:
> Hi Ilya,
>
> As Ram mentioned, we don't know the beginning of the stack trace from
> which this is triggered. We can add JVM options in the configuration
> file so that the app master is deployed with those options.
>
> Anyway, I will look into creating this application (with 20 partitions)
> and run it in local mode to find out where the problem is.
>
> Will get back to you today or tomorrow.
>
> Chandni

On Sun, Mar 20, 2016 at 9:54 AM, Amol Kekre <a...@datatorrent.com> wrote:
> Can we get on a webex to take a look?
>
> thks
> Amol

On Sat, Mar 19, 2016 at 7:27 PM, Ganelin, Ilya <ilya.gane...@capitalone.com> wrote:
> I don't think I really have any time to connect to the container. The
> application launches and crashes almost immediately. Total runtime is
> 50 seconds.
>
> Sent with Good (www.good.com)

On Saturday, March 19, 2016 5:39:11 PM, Munagala Ramanath <r...@datatorrent.com> wrote:
> There is some info here, near the end of the page:
>
> http://docs.datatorrent.com/troubleshooting/
>
> under the heading "How do I get a heap dump when a container gets an
> OutOfMemoryError?"
>
> However, since you're blowing the stack, you may need to manually run
> jmap on the running container, which may be difficult if it doesn't stay
> up for very long. There is a way to dump the heap programmatically, as
> described, for instance, here:
>
> https://blogs.oracle.com/sundararajan/entry/programmatically_dumping_heap_from_java
>
> Ram

On Sat, Mar 19, 2016 at 2:07 PM, Ganelin, Ilya <ilya.gane...@capitalone.com> wrote:
> How would we go about getting a heap dump?
>
> Sent with Good (www.good.com)

On Saturday, March 19, 2016 12:19:26 AM, Yogi Devendra <yogideven...@apache.org> wrote:
> The stack trace in the gist shows some symptoms of infinite recursion,
> but I could not figure out the exact cause.
>
> Can you please check your heap dump to see if there are any cycles in
> the object hierarchy?
>
> ~ Yogi

On 19 March 2016 at 00:36, Ashwin Chandra Putta <ashwinchand...@gmail.com> wrote:
> In the example you posted, do you have any locality constraint applied?
>
> From what I see, you have two operators - an HDFS input operator and an
> HDFS output operator. They have 40 partitions each and you don't have
> any other constraints on them. And the partitioner implementation you
> are using is com.datatorrent.common.partitioner.StatelessPartitioner.
>
> Please confirm.
>
> Regards,
> Ashwin.
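The programmatic heap dump Ram refers to can be triggered from inside the running JVM via the HotSpot diagnostic MXBean, which is useful when the container dies too quickly to attach `jmap` by hand. A sketch, assuming a HotSpot-based (Oracle/OpenJDK) runtime; the class and path names here are illustrative:

```java
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

import com.sun.management.HotSpotDiagnosticMXBean;

// HotSpot-specific sketch: write an hprof heap dump from inside the JVM.
public class HeapDumper {

    private static final String HOTSPOT_BEAN_NAME = "com.sun.management:type=HotSpotDiagnostic";

    /**
     * Writes a heap dump to filePath. The file must not already exist and,
     * on recent JDKs, must end in ".hprof".
     *
     * @param live if true, dump only objects reachable from GC roots
     */
    public static void dumpHeap(String filePath, boolean live) throws IOException {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                HOTSPOT_BEAN_NAME,
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(filePath, live);
    }

    public static void main(String[] args) throws IOException {
        String path = System.getProperty("java.io.tmpdir") + File.separator
                + "apex-debug-" + System.nanoTime() + ".hprof";
        dumpHeap(path, true);
        System.out.println("heap dump written: " + new File(path).length() + " bytes");
    }
}
```

In a short-lived container, such a call could be placed in a `catch (StackOverflowError e)` block or a shutdown hook so the state is captured before the process exits.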
On Thu, Mar 17, 2016 at 5:00 PM, Ganelin, Ilya <ilya.gane...@capitalone.com> wrote:
> I've updated the gist with a more complete example, and updated the
> associated JIRA that I've created:
> https://issues.apache.org/jira/browse/APEXCORE-392

On 3/17/16, 4:33 AM, "Tushar Gosavi" <tus...@datatorrent.com> wrote:
> Hi,
>
> I created a sample application with operators from the given link, just
> a simple input and output, and created 32 partitions of each. Could not
> reproduce the stack overflow issue. Do you have a small sample
> application which could reproduce this issue?
>
>   @Override
>   public void populateDAG(DAG dag, Configuration configuration)
>   {
>     NewlineFileInputOperator in = dag.addOperator("Input", new NewlineFileInputOperator());
>     in.setDirectory("/user/tushar/data");
>     in.setPartitionCount(32);
>
>     HdfsFileOutputOperator out = dag.addOperator("Output", new HdfsFileOutputOperator());
>     out.setFilePath("/user/tushar/outdata");
>
>     dag.getMeta(out).getAttributes().put(Context.OperatorContext.PARTITIONER,
>         new StatelessPartitioner<HdfsFileOutputOperator>(32));
>
>     dag.addStream("s1", in.output, out.input);
>   }
>
> -Tushar.
On Thu, Mar 17, 2016 at 12:30 AM, Ganelin, Ilya <ilya.gane...@capitalone.com> wrote:
> Hi guys - I'm running into a very frustrating issue where certain DAG
> configurations cause the following error log (attached). When this
> happens, my application even fails to launch. This does not seem to be
> a YARN issue since this occurs even with a relatively small number of
> partitions/memory.
>
> I've attached the input and output operators in question:
> https://gist.github.com/ilganeli/7f770374113b40ffa18a
>
> I can get this to occur predictably by:
>
>   1. Increasing the partition count on my input operator (reads from
>      HDFS) - values above 20 cause this error
>   2. Increasing the partition count on my output operator (writes to
>      HDFS) - values above 20 cause this error
>   3. Setting stream locality from the default to either thread local,
>      node local, or container local on the output operator
>
> This behavior is very frustrating as it's preventing me from
> partitioning my HDFS I/O appropriately and thus scaling to higher
> throughputs.
>
> Do you have any thoughts on what's going wrong? I would love your
> feedback.
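Yogi's earlier suggestion, checking for cycles in the object hierarchy, can also be tried without a heap dump by walking an operator's fields reflectively before the platform tries to serialize it. A stdlib-only sketch, purely illustrative and not Apex code (it skips primitives and may not reach fields of JDK-internal classes on newer JVMs):

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Illustrative: depth-first walk of an object graph that reports whether it
// contains a reference cycle (a candidate cause of unbounded recursion
// during serialization). Uses identity, not equals(), to track visits.
public class CycleDetector {

    public static boolean hasCycle(Object root) {
        return walk(root, Collections.newSetFromMap(new IdentityHashMap<>()));
    }

    private static boolean walk(Object obj, Set<Object> onPath) {
        if (obj == null || obj instanceof String || obj instanceof Number
                || obj instanceof Boolean || obj instanceof Character) {
            return false; // leaf values cannot participate in a cycle
        }
        if (!onPath.add(obj)) {
            return true; // already on the current path: back-edge found
        }
        try {
            if (obj.getClass().isArray()) {
                if (!obj.getClass().getComponentType().isPrimitive()) {
                    for (int i = 0; i < Array.getLength(obj); i++) {
                        if (walk(Array.get(obj, i), onPath)) return true;
                    }
                }
            } else {
                for (Class<?> c = obj.getClass(); c != null; c = c.getSuperclass()) {
                    for (Field f : c.getDeclaredFields()) {
                        if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                        try {
                            f.setAccessible(true);
                            if (walk(f.get(obj), onPath)) return true;
                        } catch (ReflectiveOperationException | RuntimeException ignored) {
                            // inaccessible field (e.g. JDK module encapsulation): skip
                        }
                    }
                }
            }
        } finally {
            onPath.remove(obj); // backtrack: shared references alone are not cycles
        }
        return false;
    }
}
```

Running something like this against the input and output operator instances from the gist would distinguish a genuinely cyclic object graph from a merely deep one.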
________________________________________________________

The information contained in this e-mail is confidential and/or proprietary
to Capital One and/or its affiliates and may only be used solely in
performance of work or services for Capital One. The information transmitted
herewith is intended only for use by the individual or entity to which it is
addressed. If the reader of this message is not the intended recipient, you
are hereby notified that any review, retransmission, dissemination,
distribution, copying or other use of, or taking of any action in reliance
upon this information is strictly prohibited. If you have received this
communication in error, please contact the sender and delete the material
from your computer.