[
https://issues.apache.org/jira/browse/BEAM-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Reuven Lax reassigned BEAM-2535:
--------------------------------
Assignee: Shehzaad Nakhoda (was: Batkhuyag Batsaikhan)
> Allow explicit output time independent of firing specification for all timers
> -----------------------------------------------------------------------------
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
> Issue Type: New Feature
> Components: beam-model, sdk-java-core
> Reporter: Kenneth Knowles
> Assignee: Shehzaad Nakhoda
> Priority: Major
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
> 2. For a processing time timer, it is the current input watermark at the
> time of processing.
> But for both of these, we may want to reserve the right to output a
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making
> sure output is not droppable, but does not fully explain window expiration
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform,
> timers should be viewed as another channel of input, with a watermark, and
> items on that channel _all need event time timestamps even if they are
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be
> separated (with nice defaults) from the specification of the event time of
> resulting outputs. These timestamps will determine a side channel with a new
> "timer watermark" that constrains the output watermark.
> - We still need to fire event time timers according to the input watermark,
> so that event time timers fire.
> - Late data dropping and window expiration will be in terms of the minimum
> of the input watermark and the timer watermark. In this way, whenever a timer
> is set, the window is not going to be garbage collected.
> - We will need to make sure we have a way to "wake up" a window once it is
> expired; this may be as simple as exhausting the timer channel as soon as the
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It
> seems reasonable to use timers as an implementation detail (e.g. in
> runners-core utilities) without wanting any of this additional machinery. For
> example, if there is no possibility of output from the timer callback.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)