tvalentyn commented on code in PR #23828:
URL: https://github.com/apache/beam/pull/23828#discussion_r1006272045
##########
sdks/python/apache_beam/transforms/core.py:
##########
@@ -2779,14 +2779,20 @@ def StripNonce(nonce_key_value):
cold, hot = pcoll | ParDo(SplitHotCold()).with_outputs('hot', main='cold')
cold.element_type = typehints.Any # No multi-output type hints.
+ precombined_hot = hot
+ if pcoll.windowing.accumulation_mode == AccumulationMode.ACCUMULATING:
+ # Avoid double-counting that may happen with stacked accumulating mode.
+ precombined_hot |= 'WindowIntoDiscarding' >> WindowInto(
+ pcoll.windowing, accumulation_mode=AccumulationMode.DISCARDING)
+
precombined_hot = (
- hot
- # Avoid double counting that may happen with stacked accumulating mode.
- | 'WindowIntoDiscarding' >> WindowInto(
Review Comment:
I think the current PR is the best fix I have so far. An alternative is to skip
`WindowIntoDiscarding` only if `SlidingWindows` are detected, instead of
applying it whenever accumulation_mode=AccumulationMode.DISCARDING is detected.
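
For comparison, here is a rough sketch of the two decision rules: the check in the diff above versus the `SlidingWindows`-based alternative. It uses small stand-in classes rather than Beam's real ones so that it runs on its own. The names `Windowing`, `rewindow_per_pr`, and `rewindow_per_alternative` are hypothetical, and the `windowfn` attribute is assumed to mirror Beam's `Windowing` object.

```python
from dataclasses import dataclass

# Stand-ins so the sketch runs without apache_beam; these are NOT Beam's classes.
ACCUMULATING = "ACCUMULATING"
DISCARDING = "DISCARDING"


class FixedWindows:
    """Placeholder for a non-sliding window function."""


class SlidingWindows:
    """Placeholder for a sliding window function."""


@dataclass
class Windowing:
    windowfn: object          # assumed to mirror Beam's Windowing.windowfn
    accumulation_mode: str    # ACCUMULATING or DISCARDING


def rewindow_per_pr(windowing):
    """Rule from the diff: add WindowIntoDiscarding only for accumulating input."""
    return windowing.accumulation_mode == ACCUMULATING


def rewindow_per_alternative(windowing):
    """Alternative rule: add WindowIntoDiscarding unless sliding windows are detected."""
    return not isinstance(windowing.windowfn, SlidingWindows)


if __name__ == "__main__":
    for fn in (FixedWindows(), SlidingWindows()):
        for mode in (ACCUMULATING, DISCARDING):
            w = Windowing(fn, mode)
            print(
                type(fn).__name__, mode,
                "pr:", rewindow_per_pr(w),
                "alternative:", rewindow_per_alternative(w))
```

In this sketch the two rules disagree in two cases: fixed windows in discarding mode, and sliding windows in accumulating mode.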