+1. Very useful feature. We should also provide documentation on how to use
that information for tuning.
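
To make the tuning use case concrete, here is a minimal sketch (in Python, for brevity) of how the proposed per-checkpoint numbers might be aggregated: total size per window, average size per operator, and a simple check for operators whose checkpoint size keeps growing. The `CheckpointStat` record, its fields, the sample values, and the function names are all hypothetical illustrations, not an existing Apex API.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class CheckpointStat:
    # Hypothetical record: one entry per operator per checkpointed window.
    operator: str
    window_id: int
    size_bytes: int
    write_millis: int


def total_size_by_window(stats):
    """Sum checkpoint sizes across all operators for each window."""
    totals = defaultdict(int)
    for s in stats:
        totals[s.window_id] += s.size_bytes
    return dict(totals)


def average_size_by_operator(stats):
    """Average checkpoint size for each operator across its windows."""
    sizes = defaultdict(list)
    for s in stats:
        sizes[s.operator].append(s.size_bytes)
    return {op: mean(values) for op, values in sizes.items()}


def growing_operators(stats, min_samples=3):
    """Operators whose checkpoint size increased at every consecutive checkpoint."""
    by_op = defaultdict(list)
    for s in sorted(stats, key=lambda s: s.window_id):
        by_op[s.operator].append(s.size_bytes)
    return sorted(
        op
        for op, sizes in by_op.items()
        if len(sizes) >= min_samples
        and all(a < b for a, b in zip(sizes, sizes[1:]))
    )


# Made-up sample data: "dedup" doubles its state at every checkpoint.
SAMPLE = [
    CheckpointStat("parser", 1, 1000, 20),
    CheckpointStat("parser", 2, 1100, 22),
    CheckpointStat("parser", 3, 900, 19),
    CheckpointStat("dedup", 1, 2000, 40),
    CheckpointStat("dedup", 2, 4000, 85),
    CheckpointStat("dedup", 3, 8000, 170),
]
```

An operator flagged by `growing_operators` would be the first place to look for unbounded state. The strictly-increasing rule is only one possible heuristic, and a real check would probably tolerate some noise.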

On Sun, Sep 25, 2016 at 11:27 PM, Thomas Weise <thomas.we...@gmail.com>
wrote:

> +1. Very useful during tuning and ongoing monitoring of the cost of
> checkpointing (both serialization and I/O). It can also be used to identify
> skew.
>
> --
> sent from mobile
> On Sep 25, 2016 9:10 AM, "Munagala Ramanath" <r...@datatorrent.com> wrote:
>
> > We've seen cases where operator state continues to grow without bound,
> > either because the developer was unaware of the importance of keeping
> > state small or because of some anomaly downstream. In such cases, the
> > operators could get killed with an OOM exception because these
> > checkpoints are building up in memory faster than they can be written
> > to disk.
> >
> > These stats may be useful in such cases to identify the root cause of
> > failure.
> >
> > Ram
> >
> > On Sun, Sep 25, 2016 at 7:39 AM, Sandesh Hegde <sand...@datatorrent.com>
> > wrote:
> >
> > > Say the checkpoint is x MB in size and takes y seconds to write. What
> > > does the user do with that information?
> > >
> > > On Sun, Sep 25, 2016, 6:51 AM Tushar Gosavi <tus...@datatorrent.com>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > -Tushar
> > > >
> > > > On Sun, Sep 25, 2016, 8:54 AM Sanjay Pujare <san...@datatorrent.com>
> > > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > Sanjay
> > > > >
> > > > >
> > > > > > On Sun, Sep 25, 2016 at 7:06 AM, Devendra Tagare <devend...@datatorrent.com>
> > > > > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > Thanks,
> > > > > > Dev
> > > > > >
> > > > > > > On Sep 25, 2016 1:17 AM, "Pramod Immaneni" <pra...@datatorrent.com>
> > > > > > > wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > > > On Sep 24, 2016, at 10:01 AM, Vlad Rozov <v.ro...@datatorrent.com>
> > > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > IMO, it may be useful to provide checkpoint statistics, for
> > > > > > > > > example, the total size of checkpoints for a particular
> > > > > > > > > window or the average size of checkpoints for a particular
> > > > > > > > > operator. Also, how long it takes to write checkpoints to
> > > > > > > > > storage.
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > >
> > > > > > > > Vlad
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
