Thanks Reuven. So is my conclusion correct: that it is illegal for any custom window function (+ combiner policy) to merge in a way that would regress the watermark?
What do Runners (e.g. Dataflow) do if this occurs? Does the API obligate runners to fail, or can insanity ensue? :)

On Tue, Nov 5, 2019 at 10:55 AM Reuven Lax <[email protected]> wrote:

> On Tue, Nov 5, 2019 at 8:07 AM Aaron Dixon <[email protected]> wrote:
>
>> I noticed that if I use TimestampCombiner/EARLIEST for session windows
>> that the watermark appears to get held up for sessions that never "close"
>> (or that extend for a long time).
>
> Correct - because the watermark is then being held up by the earliest
> timestamp in any extant session window.
>
>> But if I use default (TimestampCombiner/END_OF_WINDOW) the watermark
>> doesn't get held.
>
> Yes - because then the watermark gets held up by the current end of window.
>
>> Does this mean that the watermark is adjusted whenever windows are
>> merged, even before they "close"?
>
> In the second case, yes. Every time windows merge, the end of window for
> that key is recalculated. The actual watermark will be the minimum of all
> these end-of-windows (as each window is per key)
>
>> If that is the case, and I write a custom WindowFn, is this implication
>> of this that I should never move the `end`/`maxTimestamp` of the new/merged
>> window *backwards* in time?
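The behaviour described in the quoted answers can be modelled in a few lines of plain Java. This is only a toy sketch, not Beam code and not how any runner is implemented. `WatermarkHoldSketch`, `Session`, `sessionFor`, `merge`, `holdEarliest` and `holdEndOfWindow` are hypothetical names made up for illustration. Timestamps are plain `long` values, and the end-of-window timestamp is assumed to be `end - 1`, mirroring the `maxTimestamp` convention mentioned above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class WatermarkHoldSketch {

    /** Toy session window [start, end) that also remembers its earliest element timestamp. */
    public static final class Session {
        public final long start;
        public final long end;
        public final long earliest;

        public Session(long start, long end, long earliest) {
            this.start = start;
            this.end = end;
            this.earliest = earliest;
        }
    }

    /** Proto-session assigned to a single element: [ts, ts + gap). */
    public static Session sessionFor(long ts, long gap) {
        return new Session(ts, ts + gap, ts);
    }

    /** Merges intersecting windows of one key; the merged end is the max of the inputs. */
    public static List<Session> merge(List<Session> windows) {
        List<Session> sorted = new ArrayList<>(windows);
        sorted.sort(Comparator.comparingLong((Session s) -> s.start));
        List<Session> merged = new ArrayList<>();
        for (Session s : sorted) {
            Session last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && s.start < last.end) {
                // Intersecting: extend the previous window, keep the earliest timestamp.
                merged.set(merged.size() - 1, new Session(
                        last.start, Math.max(last.end, s.end), Math.min(last.earliest, s.earliest)));
            } else {
                merged.add(s);
            }
        }
        return merged;
    }

    /** Hold in the EARLIEST style: minimum earliest element timestamp over open windows. */
    public static long holdEarliest(List<Session> open) {
        return open.stream().mapToLong(s -> s.earliest).min().orElse(Long.MAX_VALUE);
    }

    /** Hold in the END_OF_WINDOW style: minimum (end - 1) over open windows. */
    public static long holdEndOfWindow(List<Session> open) {
        return open.stream().mapToLong(s -> s.end - 1).min().orElse(Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        long gap = 10;
        List<Session> windows = new ArrayList<>();
        // One key whose session keeps getting extended by new elements.
        for (long ts : new long[] {0, 8, 16, 24}) {
            windows.add(sessionFor(ts, gap));
            List<Session> open = merge(windows);
            System.out.println("after ts=" + ts
                    + " EARLIEST hold=" + holdEarliest(open)
                    + " END_OF_WINDOW hold=" + holdEndOfWindow(open));
        }
    }
}
```

In this model the EARLIEST-style hold stays at 0 for as long as the session keeps growing, while the END_OF_WINDOW-style hold advances through 9, 17, 25 and 33 with each merge. Because `merge` takes the maximum of the input ends, the hold never moves backwards here; a merge that produced an earlier end is exactly the case the question above is asking about.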
