kennknowles commented on pull request #14718: URL: https://github.com/apache/beam/pull/14718#issuecomment-847203179
Taking a concrete example, suppose you have fixed windows of one hour. What (b) allows is outputting an element with event timestamp 3:45pm in the window [2:00pm, 3:00pm). Naively, it is not logical to output at a timestamp where the `WindowFn` would not assign that window. However, we do allow this elsewhere: you can do `Window.into(...)` and then immediately do a `ParDo` that moves the timestamps. I _think_ in this case you can move the timestamp past the window max, but I do not recall without checking the code.

So that is why I am OK with (b) or (d). They are generally consistent with other things that you can do, even though they may not make sense.

I think (c) is the one that is obviously logical. It means that the element always actually falls within the window it is assigned to. That is sort of the point of windows, really.

I think SQL has the clearest interpretation: a window is just another thing in the GROUP BY that has some way of knowing when it is done. There is no concept of decoupling window and element timestamp; you just GROUP BY window. Once the window has been added to a row, the other values don't really matter. But the watermark needs to be faithful to the window.
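To make the 3:45pm example concrete, here is a minimal sketch of the arithmetic in plain Python. It does not use Beam. The helper names `assign_fixed_window` and `falls_within` and the calendar date are hypothetical, and alignment to the Unix epoch is an assumption made for illustration, not a statement of how `WindowFn` is implemented.

```python
from datetime import datetime, timedelta

# Assumption for illustration: fixed windows are aligned to the Unix epoch.
EPOCH = datetime(1970, 1, 1)


def assign_fixed_window(ts, size):
    """Return the half-open [start, end) fixed window that contains ts."""
    offset = (ts - EPOCH) % size  # how far ts sits past the window start
    start = ts - offset
    return start, start + size


def falls_within(ts, window):
    """True if ts lies inside the half-open window [start, end)."""
    start, end = window
    return start <= ts < end


one_hour = timedelta(hours=1)
day = datetime(2021, 1, 1)  # arbitrary date, chosen only for the example

element_ts = day.replace(hour=15, minute=45)  # 3:45pm
window_2_to_3 = (day.replace(hour=14), day.replace(hour=15))  # [2:00pm, 3:00pm)

# Hourly fixed windows would place 3:45pm in [3:00pm, 4:00pm).
natural_window = assign_fixed_window(element_ts, one_hour)

# Option (b) would still allow the output in [2:00pm, 3:00pm);
# option (c) would require this check to hold.
consistent_under_c = falls_within(element_ts, window_2_to_3)
```

Here `natural_window` differs from `window_2_to_3`, and `consistent_under_c` is false, which is the mismatch that (b) tolerates and (c) rules out.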
