Just curious, was that write happening in one of the operator callbacks?

Ram

On Wed, Mar 1, 2017 at 9:44 AM, Sunil Parmar <[email protected]> wrote:

> I think we figured out the issue. It was Cassandra; in that environment
> one of the nodes was making writes super slow. We fixed the cluster and
> now it's much better.
>
> On 2017-02-28 13:09 (-0800), Sandesh Hegde <[email protected]> wrote:
> > Can you please attach the stack trace of the operator?
> >
> > You can increase the attribute TIMEOUT_WINDOW_COUNT; the AppMaster uses
> > that to decide when to kill the blocked operator.
> >
> > For taking a stack trace, see the blog post:
> > https://www.datatorrent.com/blog/getting-stack-traces-apache-apex-applications/
> >
> > On Tue, Feb 28, 2017 at 12:59 PM Sunil Parmar <[email protected]> wrote:
> >
> > > Ashwin,
> > > I don't see any such warning. I'll PM you the entire log file.
> > >
> > > On 2017-02-28 12:16 (-0800), Ashwin Chandra Putta <[email protected]> wrote:
> > > > Sunil,
> > > > This might be related to checkpointing. See:
> > > > https://github.com/apache/apex-core/blob/master/engine/src/main/java/com/datatorrent/stram/StreamingContainerManager.java#L2211-L2217
> > > >
> > > > Also check this piece of code:
> > > > https://github.com/apache/apex-core/blob/master/engine/src/main/java/com/datatorrent/stram/StreamingContainerManager.java#L2031-L2044
> > > >
> > > > Can you paste the output of the warning from the code above, which
> > > > starts with 'Marking operator '?
> > > >
> > > > Regards,
> > > > Ashwin.
> > > >
> > > > On Tue, Feb 28, 2017 at 12:03 PM, Sunil Parmar <[email protected]> wrote:
> > > >
> > > > > That doesn't seem to be the case. We do see the window id moving
> > > > > in the UI as well.
> > > > >
> > > > > On 2017-02-28 11:19 (-0800), Munagala Ramanath <[email protected]> wrote:
> > > > > > It likely means that the operator is taking too long to return
> > > > > > from one of the callbacks like beginWindow(), endWindow(),
> > > > > > emitTuples(), etc. Do you have any potentially blocking calls to
> > > > > > external systems in any of those callbacks?
> > > > > >
> > > > > > Ram
> > > > > >
> > > > > > On Tue, Feb 28, 2017 at 11:09 AM, Sunil Parmar <[email protected]> wrote:
> > > > > >
> > > > > > > 2017-02-27 19:43:21,926 INFO com.datatorrent.stram.StreamingContainerManager: Blocked operator PTOperator[id=3,name=eventUpdatesFormatter] container PTContainer[id=1(container_1487310232732_0027_02_000111),state=ACTIVE] time 61905ms
> > > > > > > 2017-02-27 19:43:22,928 INFO com.datatorrent.stram.StreamingAppMasterService: Completed containerId=container_1487310232732_0027_02_000111, state=COMPLETE, exitStatus=-105, diagnostics=Container killed by the ApplicationMaster.
> > > > > > > Container killed on request. Exit code is 143
> > > > > > > Container exited with a non-zero exit code 143
> > > > > > >
> > > > > > > Can anyone help us understand this error? We see that one of
> > > > > > > the operators keeps restarting its container; the above error
> > > > > > > is from the AppMaster log.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Sunil

--
_______________________________________________________
Munagala V. Ramanath
Software Engineer
E: [email protected] | M: (408) 331-5034 | Twitter: @UnknownRam
www.datatorrent.com | apex.apache.org
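
For anyone landing on this thread with the same "Blocked operator" symptom: the pattern discussed above (a slow external write stalling an operator callback) can be illustrated with a minimal sketch in plain Java. This is not Apex API and not what Sunil's application does; BufferedWriterOperator, process(), teardown() and slowWrite() are hypothetical names invented for the illustration, and the "slow sink" is simulated with a sleep. The idea is only that the callback hands tuples to a bounded queue and a background thread performs the slow write, so the callback itself returns quickly. A real operator would also have to decide what to do when the buffer is full and how buffered tuples interact with checkpointing, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class AsyncWriteSketch {

    /** Hypothetical stand-in for an operator; it does not use any Apex classes. */
    static final class BufferedWriterOperator {
        private final BlockingQueue<String> pending;
        private final ExecutorService writer = Executors.newSingleThreadExecutor();
        private final Consumer<String> slowSink;
        private volatile boolean running = true;

        BufferedWriterOperator(int capacity, Consumer<String> slowSink) {
            this.pending = new ArrayBlockingQueue<>(capacity);
            this.slowSink = slowSink;
            writer.execute(this::drain); // background thread does the slow writes
        }

        /** Keeps writing until teardown was requested and the buffer is empty. */
        private void drain() {
            try {
                while (running || !pending.isEmpty()) {
                    String tuple = pending.poll(50, TimeUnit.MILLISECONDS);
                    if (tuple != null) {
                        slowSink.accept(tuple);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        /** Callback-style entry point: never blocks; returns false if the buffer is full. */
        boolean process(String tuple) {
            return pending.offer(tuple);
        }

        /** Stops accepting work and waits (bounded) for buffered tuples to be written. */
        void teardown() {
            running = false;
            writer.shutdown();
            try {
                writer.awaitTermination(30, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Simulated slow external write (e.g. a degraded database node). */
    static void slowWrite(List<String> store, String tuple) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        store.add(tuple);
    }

    public static void main(String[] args) {
        List<String> written = Collections.synchronizedList(new ArrayList<>());
        BufferedWriterOperator op =
                new BufferedWriterOperator(100, t -> slowWrite(written, t));

        long start = System.nanoTime();
        for (int i = 0; i < 10; i++) {
            op.process("event-" + i);
        }
        long enqueueMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        op.teardown();
        System.out.println("enqueue took " + enqueueMs + " ms, wrote " + written.size() + " tuples");
    }
}
```

Returning false from process() when the queue is full is a deliberate choice here: it makes back-pressure visible to the caller instead of hiding it behind a blocking put(), which would reintroduce the stall this sketch is trying to avoid.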
