boyuanzz commented on pull request #14268: URL: https://github.com/apache/beam/pull/14268#issuecomment-813504955
> > A high-level question regarding
> > > it was added in response to a real user problem: sliding windows + EARLIEST timestamp combiner delays downstream aggregations a lot
> > > but when that problem occurred, timestamp combiner EARLIEST was the default; with the change to how lateness is defined we changed the default to END_OF_WINDOW
> >
> > With this PR, what will happen to the real user problem mentioned above?
>
> Short answer: the real user problem will be back but not as bad.
>
> Longer answer: EARLIEST used to be the default when `getOutputTime` was introduced to solve the problem. Now the default is END_OF_WINDOW so the problem is not as bad. To use sliding windows (or other overlapping windows) with EARLIEST, the user has some choices:
>
> * Just be OK with the delay of downstream GBK (already true for Python)
> * Set up a non-default trigger (this will free watermark holds and allow progress downstream)
> * Manually move element timestamps forward with a `ParDo` before GBK, achieving identical behavior
>
> I think further discussion should continue on the dev@ thread instead of the PR though.

Thanks for the explanation. If this turns out to be a common issue, we should document it in the release notes or somewhere else.
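For anyone reading along, here is a rough sketch of the arithmetic behind the third option (moving timestamps forward). It is plain Python, not Beam code, and it rests on assumptions: timestamps are integers, windows are half-open `[start, end)` intervals of length `size` starting every `period`, and "moving forward" means that an element in a later overlapping window gets a timestamp no earlier than the end of the previous window. The function names `sliding_windows` and `shifted_timestamp` are hypothetical, chosen for illustration only.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # half-open interval [start, end)


def sliding_windows(ts: int, size: int, period: int) -> List[Window]:
    """Return every sliding window containing ts, earliest window first."""
    # Start of the latest window that contains ts.
    start = ts - (ts % period)
    windows = []
    # Walk backwards one period at a time while the window still covers ts.
    while start > ts - size:
        windows.append((start, start + size))
        start -= period
    return sorted(windows)


def shifted_timestamp(ts: int, window: Window, period: int) -> int:
    """Move ts forward so it does not fall before the end of the previous window.

    The previous window ends at (end - period), so the shifted timestamp is
    the later of the original timestamp and that point.
    """
    _, end = window
    return max(ts, end - period)


if __name__ == "__main__":
    size, period, ts = 60, 20, 25
    for w in sliding_windows(ts, size, period):
        print(w, shifted_timestamp(ts, w, period))
```

With `size=60`, `period=20` and an element at `25`, only the earliest containing window keeps the original timestamp; the later windows get timestamps moved to the end of the window before them, so an EARLIEST-style hold in a later window would no longer sit behind the earlier window's output.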
