[ https://issues.apache.org/jira/browse/BEAM-7614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17122769#comment-17122769 ]
Beam JIRA Bot commented on BEAM-7614: ------------------------------------- This issue is P2 but has been unassigned without any comment for 60 days so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3. Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean. > Event-time timers seem to sometimes fire multiple times on dataflow + > streaming engine > -------------------------------------------------------------------------------------- > > Key: BEAM-7614 > URL: https://issues.apache.org/jira/browse/BEAM-7614 > Project: Beam > Issue Type: Bug > Components: runner-dataflow > Reporter: Steve Niemitz > Priority: P2 > Labels: stale-P2 > > This is kind of hard to reproduce, but I've seen it happen a few times in the > wild now. > We have a DoFn that sets an event-time timer at window.maxTimestamp, the > timer callback does something like: > {code:java} > def onWindowClose( > @StateId(...) key: ValueState[K], > @StateId(...) values: CombiningState[V], > out: OutputReceiver[O], > ... > ) { > > val k = key.read() > val values = values.read() > out.output(KV.of(k, values) > key.clear() > values.clear() > }{code} > Essentially, keep track of the key, accumulate values seen in a window, and > emit them at the end of the window. > ProcessElement is pretty simple as well: > {code:java} > def processElement( > ctx: ProcessContext, > @StateId(...) key: ValueState[K], > @StateId(...) values: CombiningState[V], > ... > ) { > key.write(ctx.element().getKey()) > value.add(ctx.element().getValue()) > timer.set(window.maxTimestamp()) > }{code} > However, *ONLY* when running on streaming engine (this doesn't happen > otherwise), I'll see cases where the onWindowClose timer fires with a null > key, and empty values. > This can only happen if the timer fired twice, since it wouldn't have been > set if no elements had arrived, and if late data had arrived, it would have > set the key (and added to the combining state). Also, we never have late > date in our pipeline. > An interesting other thing I noticed is that these "phantom firings" seem to > happen ~10-15 minutes _AFTER_ the window closes. > Again, its pretty rate, we'll have millions of keys in a window, and I'll > only see the error happen every few hours (with hourly windows). > Let me know if I can clarify anything else! -- This message was sent by Atlassian Jira (v8.3.4#803005)