On Wed, Mar 29, 2017 at 12:16 AM, JingsongLee <[email protected]> wrote:
> If user have a WordCount StatefulDoFn, the result of
> counts is always changing before the expiration of window.
> Maybe the user want a signal to know the count is the final value
> and then archive the value to the timing database or somewhere else.
> best,
> JingsongLee
>

This is a good point to bring up, but actually already required to be
handled by the runner. This issue exists with timers already. The runner
must sequence these:

1. Expire the window and start dropping any more input
2. Fire the user's expiration callback
3. Delete the state for the window

This actually made me think of a special property of @OnWindowExpiration:
we can forbid Timer parameters. If we followed Robert's idea we could do
static analysis and enforce the same thing. This is a pretty good
motivation for the special feature. It is more than convenience.

Kenn

> ------------------------------------------------------------------
> From: Kenneth Knowles <[email protected]>
> Time: 2017 Mar 29 (Wed) 09:07
> To: dev <[email protected]>
> Subject: Re: [PROPOSAL] @OnWindowExpiration
>
> On Tue, Mar 28, 2017 at 2:47 PM, Eugene Kirpichov <[email protected]> wrote:
>
> > Kenn, can you quote some use cases for this, to make it more clear what are
> > the consequences of having this API in this form?
> >
> > I recall that one of the main use cases was batching DoFn, right?
> >
>
> I believe every stateful DoFn where the data stored in state represents
> some accumulation of the input and/or buffering of output requires this.
> So, yes:
>
>  - batching DoFn and the many variants that may spring up
>  - combine-like stateful DoFns that require state, like blended
>    accumulation modes or selective composed combines
>  - trigger-like stateful DoFns that output based on some complex
>    user-defined criteria
>
> The stateful DoFns that do not require such a timer are those where the
> stored data is a phase transition or side-input-like enrichment, and I
> think also common join algorithms.
>
> I don't have a sense of which of these will be more prevalent. Both
> categories represent common user needs.
>
> Kenn
>
> > On Tue, Mar 28, 2017 at 1:37 PM Kenneth Knowles <[email protected]>
> > wrote:
> >
> > > On Tue, Mar 28, 2017 at 1:32 PM, Robert Bradshaw <[email protected]> wrote:
> > >
> > > > Another alternative is to be able to set special timers, e.g. end of window
> > > > and expiration of window. That at least addresses (2).
> > > >
> > >
> > > Potentially a tangent, but that would perhaps fit in with the idea of
> > > removing TimeDomain from user APIs
> > > (https://issues.apache.org/jira/browse/BEAM-1308) and instead having
> > > TimerSpecs.eventTimeTimer(), TimerSpecs.processingTimeTimer(),
> > > TimerSpecs.windowExpirationTimer() that each yield distinct sorts of
> > > parameters in @ProcessElement.
> > >
> > > A bit more heavyweight, syntactically.
> > >
> > > Kenn
> > >
> > > > On Tue, Mar 28, 2017 at 1:27 PM, Kenneth Knowles <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I have a little extension to the stateful DoFn annotations to circulate for
> > > > > feedback: Allow a method to be annotated with @OnWindowExpiration to
> > > > > automatically get a callback at some point after the window has expired,
> > > > > but before the state for the window has been cleared.
> > > > >
> > > > > Today, a user can pretty easily get the same effect by setting a timer for
> > > > > the end of the window + allowed lateness in their @ProcessElement calls.
> > > > > But having just one annotation for it has a couple nice benefits:
> > > > >
> > > > > 1. Some users assume a naive implementation so they are concerned that
> > > > > setting a timer repeatedly is costly. This eliminates the cause for user
> > > > > alarm and allows a runner to do a better job in case it didn't already do
> > > > > it efficiently.
> > > > >
> > > > > 2. Getting the allowed lateness to be available to your @ProcessElement is
> > > > > a little crufty.
> > > > >
> > > > > 3. Often, if you don't have @OnWindowExpiration, you are leaving behind
> > > > > state that might contain data that is otherwise lost. So I would even
> > > > > consider making it mandatory (with some way of indicating state you don't
> > > > > care about dropping) though that could be annoying.
> > > > >
> > > > > Another interesting moment in a window's lifecycle is @EndOfWindow. This is
> > > > > not critical for correctness, though.
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > Kenn
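
For readers of the archive, here is a rough sketch of the sequencing
described in the reply at the top of this thread (expire the window and
drop further input, fire the user's expiration callback, delete the
state). It is a toy single-window model in plain Java and does not use
Beam's API: the class WindowExpirationSketch, its method names, and the
rule that the window expires once the watermark reaches window end +
allowed lateness are all assumptions made for illustration. The callback
receives only the stored values and no timer handle, which mirrors the
idea of forbidding Timer parameters in @OnWindowExpiration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Toy model of one window's lifecycle; hypothetical, not Beam's API. */
class WindowExpirationSketch {
    // Assumed expiration rule: window end + allowed lateness.
    private final long expirationTime;
    // Stand-in for a user's expiration callback: sees state only, no timers.
    private final BiConsumer<String, Integer> onWindowExpiration;
    // Per-word counts, standing in for the stateful DoFn's state.
    private final Map<String, Integer> counts = new HashMap<>();
    private boolean expired = false;
    // Records the order of lifecycle steps so it can be inspected.
    final List<String> log = new ArrayList<>();

    WindowExpirationSketch(long windowEnd, long allowedLateness,
                           BiConsumer<String, Integer> onWindowExpiration) {
        this.expirationTime = windowEnd + allowedLateness;
        this.onWindowExpiration = onWindowExpiration;
    }

    /** Counts a word; returns false if the input was dropped as too late. */
    boolean processElement(String word) {
        if (expired) {
            log.add("dropped " + word);
            return false;
        }
        counts.merge(word, 1, Integer::sum);
        return true;
    }

    /** Runs the three steps, in order, once the watermark reaches expiration. */
    void advanceWatermark(long watermark) {
        if (expired || watermark < expirationTime) {
            return;
        }
        expired = true;                      // 1. expire; later input is dropped
        log.add("expired");
        counts.forEach(onWindowExpiration);  // 2. callback sees the final values
        log.add("callback fired");
        counts.clear();                      // 3. only now delete the state
        log.add("state cleared");
    }

    boolean hasState() {
        return !counts.isEmpty();
    }

    public static void main(String[] args) {
        Map<String, Integer> archive = new HashMap<>();
        WindowExpirationSketch window =
                new WindowExpirationSketch(100, 10, archive::put);
        window.processElement("beam");
        window.processElement("beam");
        window.processElement("timer");
        window.advanceWatermark(105);   // before expiration: nothing happens
        window.advanceWatermark(110);   // reaches expiration: steps 1 to 3 run
        window.processElement("beam");  // arrives after expiration: dropped
        System.out.println(archive);
        System.out.println(window.log);
    }
}
```

In this model the archive map plays the role of the "somewhere else" that
final counts are written to. Because the callback runs after the window
has stopped accepting input and before the state is cleared, the values
it sees cannot change afterwards.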
