+1. Very useful feature. We should also provide documentation on how to use that information for tuning.
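As a possible starting point for that documentation, here is a rough sketch of how the proposed statistics (total checkpoint size per window, average size and write time per operator) might be aggregated and used to spot unbounded state growth or skew. This is not an Apex API: the CheckpointStat record, the function names, and the 2x thresholds are all assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import mean, median
from typing import NamedTuple


class CheckpointStat(NamedTuple):
    """Hypothetical record for one checkpoint; not an actual Apex type."""
    operator: str         # operator (or partition) name
    window_id: int        # window in which the checkpoint was taken
    size_bytes: int       # serialized checkpoint size
    write_seconds: float  # time taken to write it to storage


def total_size_per_window(stats):
    """Total checkpoint size for each window, summed over all operators."""
    totals = defaultdict(int)
    for s in stats:
        totals[s.window_id] += s.size_bytes
    return dict(totals)


def average_per_operator(stats):
    """Map each operator to (average size in bytes, average write seconds)."""
    sizes, times = defaultdict(list), defaultdict(list)
    for s in stats:
        sizes[s.operator].append(s.size_bytes)
        times[s.operator].append(s.write_seconds)
    return {op: (mean(sizes[op]), mean(times[op])) for op in sizes}


def growing_operators(stats, factor=2.0):
    """Operators whose latest checkpoint is at least `factor` times their first.

    The threshold is an arbitrary assumption; a crude hint of unbounded state.
    """
    by_op = defaultdict(list)
    for s in sorted(stats, key=lambda s: s.window_id):
        by_op[s.operator].append(s.size_bytes)
    return sorted(op for op, sizes in by_op.items()
                  if len(sizes) > 1 and sizes[-1] >= factor * sizes[0])


def skewed_operators(stats, factor=2.0):
    """Operators whose average size exceeds `factor` times the median average."""
    avg = {op: size for op, (size, _) in average_per_operator(stats).items()}
    typical = median(avg.values())
    return sorted(op for op, size in avg.items() if size > factor * typical)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        CheckpointStat("parser", 1, 100, 0.1),
        CheckpointStat("dedup", 1, 200, 0.2),
        CheckpointStat("parser", 2, 100, 0.1),
        CheckpointStat("dedup", 2, 800, 0.6),
    ]
    print(total_size_per_window(sample))
    print(average_per_operator(sample))
    print(growing_operators(sample))
```

A user could watch the per-operator averages over time: a steadily rising size suggests state that needs to be trimmed, and a write time that approaches the checkpoint interval suggests checkpoints may pile up in memory faster than they can be written.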

On Sun, Sep 25, 2016 at 11:27 PM, Thomas Weise <thomas.we...@gmail.com> wrote:

> +1 very useful during tuning and ongoing monitoring for cost of
> checkpointing (both, serialization and io). Can also be used to identify
> skew.
>
> --
> sent from mobile
>
> On Sep 25, 2016 9:10 AM, "Munagala Ramanath" <r...@datatorrent.com> wrote:
>
> > We've seen cases where operator state continues to grow without bound
> > either because the developer was unaware of the importance of keeping
> > state small or because of some anomaly downstream. In such cases, the
> > operators could get killed with an OOM exception because these
> > checkpoints are building up in memory faster than they can be written
> > to disk.
> >
> > These stats may be useful in such cases to identify the root cause of
> > failure.
> >
> > Ram
> >
> > On Sun, Sep 25, 2016 at 7:39 AM, Sandesh Hegde <sand...@datatorrent.com>
> > wrote:
> >
> > > Say it takes x MB size and y seconds to do the checkpoint. What does
> > > the user do with that information?
> > >
> > > On Sun, Sep 25, 2016, 6:51 AM Tushar Gosavi <tus...@datatorrent.com>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > -Tushar
> > > >
> > > > On Sun, Sep 25, 2016, 8:54 AM Sanjay Pujare <san...@datatorrent.com>
> > > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > Sanjay
> > > > >
> > > > > On Sun, Sep 25, 2016 at 7:06 AM, Devendra Tagare
> > > > > <devend...@datatorrent.com> wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > Thanks,
> > > > > > Dev
> > > > > >
> > > > > > On Sep 25, 2016 1:17 AM, "Pramod Immaneni"
> > > > > > <pra...@datatorrent.com> wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > > On Sep 24, 2016, at 10:01 AM, Vlad Rozov
> > > > > > > > <v.ro...@datatorrent.com> wrote:
> > > > > > > >
> > > > > > > > IMO, it may be useful to provide checkpoint statistics for
> > > > > > > > example, total size of checkpoint for particular window or
> > > > > > > > average size of checkpoints for a particular operator. Also,
> > > > > > > > how long it takes to write checkpoints to storage.
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > >
> > > > > > > > Vlad