By the way, the fix I see is to extend WindowAssigner with an
"isEventTime()" method and then allow accumulating/lateness in the
WindowOperator only if it returns true.

But it seems a bit hacky because it special-cases event-time. Then
again, maybe we need to special-case it ...
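
To make the idea concrete, here is a minimal sketch of what I have in mind.
It does not use the real Flink classes: every "Sketch*" type and every method
other than isEventTime() is a hypothetical stand-in, and silently ignoring
the settings for processing-time windows is only one possible choice.

```java
// Hypothetical stand-ins that only mimic the shape of WindowAssigner and
// WindowOperator; none of these types are the real Flink classes.
interface SketchWindowAssigner {
    /** Proposed method: true if the assigned windows are in the event-time domain. */
    boolean isEventTime();
}

final class SketchEventTimeWindows implements SketchWindowAssigner {
    @Override
    public boolean isEventTime() {
        return true;
    }
}

final class SketchProcessingTimeWindows implements SketchWindowAssigner {
    @Override
    public boolean isEventTime() {
        return false;
    }
}

final class SketchWindowOperator {
    private final boolean accumulating;
    private final long allowedLatenessMillis;

    SketchWindowOperator(SketchWindowAssigner assigner,
                         boolean accumulating,
                         long allowedLatenessMillis) {
        if (allowedLatenessMillis < 0) {
            throw new IllegalArgumentException("allowed lateness must be >= 0");
        }
        // Assumption: for processing-time windows the settings are ignored
        // rather than rejected, because elements cannot be late there.
        boolean eventTime = assigner.isEventTime();
        this.accumulating = eventTime && accumulating;
        this.allowedLatenessMillis = eventTime ? allowedLatenessMillis : 0L;
    }

    boolean isAccumulating() {
        return accumulating;
    }

    long getAllowedLatenessMillis() {
        return allowedLatenessMillis;
    }

    /** Timestamp of the cleanup timer: end of the window plus the allowed lateness. */
    long cleanupTime(long windowEndMillis) {
        return windowEndMillis + allowedLatenessMillis;
    }
}

final class LatenessSketch {
    public static void main(String[] args) {
        SketchWindowOperator eventTimeOp =
                new SketchWindowOperator(new SketchEventTimeWindows(), true, 5_000L);
        SketchWindowOperator processingTimeOp =
                new SketchWindowOperator(new SketchProcessingTimeWindows(), true, 5_000L);

        System.out.println("event-time cleanup at: " + eventTimeOp.cleanupTime(10_000L));
        System.out.println("processing-time cleanup at: " + processingTimeOp.cleanupTime(10_000L));
    }
}
```

Throwing an exception in the constructor when lateness or accumulating mode
is combined with a processing-time assigner would be the stricter
alternative; it makes the misconfiguration visible instead of hiding it.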

On Tue, 5 Apr 2016 at 12:23 Aljoscha Krettek <aljos...@apache.org> wrote:

> Hi Folks,
> as part of my effort to improve the windowing in Flink [1] I also thought
> about lateness, accumulating/discarding and window cleanup. I have some
> ideas on this but I would love to get feedback from the community as I
> think that these things are important for everyone doing event-time
> windowing on Flink.
>
> The basic problem is this: Some elements can arrive behind the watermark
> if the watermark is not 100 % correct (which it is not, in most cases, I
> would assume). We need to provide an API that lets users specify what
> happens when these late elements arrive. There are two main knobs for the
> user here:
>
> - Allowed Lateness: How late can an element be before it is completely
> ignored, i.e. simply discarded
>
> - Accumulating/Discarding Fired Windows: When we fire a window, do we
> purge the contents or do we keep them around until the watermark passes the
> end of the window plus the allowed lateness? If we keep the window, a late
> element will be added to the window and the window will be emitted again.
> If we don't keep the window, then the late element will essentially trigger
> emission of a one-element window.
>
> This is somewhat straightforward to implement: If accumulating, set a timer
> for the end of the window plus the allowed lateness. Clean up the window
> when that fires (basically). All in event-time with watermarks.
>
> My problem is only this: what should happen if the user specifies some
> allowed lateness and/or accumulating mode but uses processing-time
> windowing? For processing-time windows these don't make sense because
> elements cannot be late by definition. The problem is that we cannot
> figure out, by looking at a WindowAssigner or the Windows that it assigns
> to elements, whether these windows are in the event-time or processing-time
> domain. At the API level this is also not easily visible, since a user
> might have set the "stream-time-characteristic" to event-time but still use
> a processing-time window (plus trigger) in the program.
>
> Any ideas for solving this are extremely welcome. :-)
>
> Cheers,
> Aljoscha
>
> [1]
> https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit#heading=h.psfzjlv68tp
>
