Hi Sergii,

The window operator doesn't affect or adjust the output watermark; it just propagates it as-is, as the documentation says. I think the mistake here is that you are using the wrong event time for the window: you should use TUMBLE_ROWTIME(...) as event_time [1].

The event time of a window should be the maximum timestamp of the window. For example, for a window of [10:00, 11:00), the event time is 10:59:59.999, not the start time. The event time indicates when the event happens, and a window "happens" when it closes (at its max timestamp). That's how TUMBLE_ROWTIME is calculated in Flink SQL.
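
To make the timestamp arithmetic concrete, here is a small plain-Python sketch (not Flink code). The function name tumble_window is made up for illustration, and it assumes windows are aligned to the Unix epoch with no offset and that timestamps have millisecond precision:

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def tumble_window(ts: datetime, size: timedelta):
    """Return (start, end, rowtime) of the tumbling window containing ts.

    Illustrative helper only; assumes epoch-aligned windows with no offset.
    """
    offset = (ts - EPOCH) % size               # how far ts is into its window
    start = ts - offset                        # inclusive lower bound
    end = start + size                         # exclusive upper bound
    rowtime = end - timedelta(milliseconds=1)  # inclusive upper bound (max timestamp)
    return start, end, rowtime

start, end, rowtime = tumble_window(datetime(2020, 6, 22, 10, 17, 30), timedelta(hours=1))
print(start)    # 2020-06-22 10:00:00
print(end)      # 2020-06-22 11:00:00
print(rowtime)  # 2020-06-22 10:59:59.999000
```

Here start corresponds to what TUMBLE_START returns and rowtime to what TUMBLE_ROWTIME returns. With a 1-day window the two differ by almost a full day, which is why rows stamped with the window start look late relative to the propagated watermark.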

Best,
Jark

[1]: https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/sql/queries.html#selecting-group-window-start-and-end-timestamps

On Mon, 22 Jun 2020 at 04:37, Sergii Mikhtoniuk <mikhton...@gmail.com> wrote:

> Greetings,
>
> When playing around with the following simple event-time stream
> aggregation:
>
> SELECT
>   TUMBLE_START(event_time, INTERVAL '1' DAY) as event_time,
>   ...
> FROM input
> GROUP BY TUMBLE(event_time, INTERVAL '1' DAY), symbol
>
> ...to my surprise I found out that the tumbling window operator has no
> effect on the watermarks of the resulting append stream - the watermarks of
> the input stream are propagated as-is.
>
> This seems to be a documented behavior
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#interaction-of-watermarks-and-windows
> but it's still very counter-intuitive to me and I couldn't find any
> explanation of it.
>
> My understanding of the watermarking is that MOST data is expected to
> arrive with event time below the stream's watermark. Late events are either
> discarded or should be handled as exceptional cases, e.g. via "allowed
> lateness".
>
> So in my aggregation above I was expecting the result watermark to be
> offset by ~1 day from the input and be emitted only after a tumbling window
> closes. Instead, with input watermarks propagated as-is ALL events in the
> resulting stream end up being late in relation to the current watermark...
> Doesn't this behavior ruin the composition, as downstream operators will be
> discarding all late data?
>
> I'd greatly appreciate if someone could shed light on this design decision.
>
> Thanks,
> Sergii