kennknowles commented on pull request #14718:
URL: https://github.com/apache/beam/pull/14718#issuecomment-847203179


   Taking a concrete example, suppose you have fixed windows of an hour. What 
(b) allows is outputting an element with event timestamp 3:45pm in the window 
[2:00pm - 3:00pm). Naively, that is not logical: the `WindowFn` would never 
assign an element with that timestamp to that window.
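
   A minimal sketch of that example in plain Python, not the Beam API. The 
helper `fixed_window` is hypothetical and only mimics what an hourly fixed 
`WindowFn` would do; the date is arbitrary.

```python
from datetime import datetime, timedelta

def fixed_window(ts, size=timedelta(hours=1)):
    """Hypothetical helper: return the [start, end) fixed window containing ts."""
    epoch = datetime(1970, 1, 1)
    start = ts - (ts - epoch) % size  # round down to a multiple of the window size
    return start, start + size

# Arbitrary date; only the time of day matters for the example.
ts = datetime(2021, 5, 24, 15, 45)                  # 3:45pm
target = (datetime(2021, 5, 24, 14, 0),
          datetime(2021, 5, 24, 15, 0))             # [2:00pm, 3:00pm)

assigned = fixed_window(ts)
print(assigned == target)  # the hourly assignment never yields the 2pm window for 3:45pm
```

   Under these assumptions the element would be assigned to [3:00pm, 4:00pm), 
which is why outputting it in [2:00pm, 3:00pm) looks odd.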
   
   However, we do allow this elsewhere. You can do `Window.into(...)` and then 
immediately do a `ParDo` that moves the timestamps. I _think_ in this case you 
can move the timestamp past the window max, but I do not recall without checking 
the code. That is why I am OK with (b) or (d): they are generally consistent 
with other things that you can do, even though they may not make sense.
   
   I think (c) is the one that is obviously logical. It means that the element 
always actually falls within the window that it is assigned. That is sort of 
the point of windows, really.
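
   As a rough illustration, the property described for (c) can be written as a 
simple containment check. This is plain Python rather than Beam code, and 
`falls_within` is a hypothetical name.

```python
from datetime import datetime

def falls_within(ts, window):
    """Hypothetical check: is ts inside the half-open window [start, end)?"""
    start, end = window
    return start <= ts < end

# Arbitrary date; the window is [2:00pm, 3:00pm).
window = (datetime(2021, 5, 24, 14, 0), datetime(2021, 5, 24, 15, 0))

inside = falls_within(datetime(2021, 5, 24, 14, 30), window)   # 2:30pm
outside = falls_within(datetime(2021, 5, 24, 15, 45), window)  # 3:45pm
at_end = falls_within(datetime(2021, 5, 24, 15, 0), window)    # 3:00pm, end is exclusive
```

   Under this check, 2:30pm is accepted, while 3:45pm and the exclusive end 
3:00pm are rejected.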
   
   I think SQL has the clearest interpretation: a window is just another 
column in the GROUP BY, one that has some way of knowing when it is done. There 
is no concept of decoupling window and element timestamp; you just GROUP BY window. 
Once the window has been added to a row, the other values don't really matter. 
But the watermark needs to be faithful to the window.
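
   A small sketch of that SQL-style view, in plain Python with made-up rows. 
The column names and values are assumptions for illustration only.

```python
from collections import defaultdict

# Each row already carries its window; the timestamp is just another column.
rows = [
    {"window": "[2:00pm, 3:00pm)", "ts": "2:10pm", "value": 1},
    {"window": "[2:00pm, 3:00pm)", "ts": "3:45pm", "value": 2},  # ts outside its window
    {"window": "[3:00pm, 4:00pm)", "ts": "3:20pm", "value": 4},
]

totals = defaultdict(int)
for row in rows:
    # Equivalent in spirit to GROUP BY window: "ts" is never consulted.
    totals[row["window"]] += row["value"]

for window, total in totals.items():
    print(window, total)
```

   The grouping depends only on the window column, so the 3:45pm row is 
aggregated with the 2pm window regardless of its timestamp.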



