Sam Rohde created BEAM-8582:
-------------------------------
Summary: Python SDK emits duplicate records for Default and
AfterWatermark triggers
Key: BEAM-8582
URL: https://issues.apache.org/jira/browse/BEAM-8582
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Reporter: Sam Rohde
Assignee: Sam Rohde
This was found after fixing https://issues.apache.org/jira/browse/BEAM-8581.
The fix for BEAM-8581 was to pass in the input watermark. Previously, all of
the end-of-window (EOW) calculations used MIN_TIMESTAMP. Once they were given
a proper input watermark, this bug started to manifest.
The DefaultTrigger and AfterWatermark triggers do not clear their timers after
the watermark passes the end of the window, leading to duplicate records being
emitted.
Fix: Clear the watermark timer when the watermark reaches the end of the window.
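
To illustrate the failure mode and the proposed fix, here is a minimal toy
sketch. It is not the Beam SDK's trigger code: ToyWatermarkTrigger,
clear_timer_on_fire, and emitted_panes are hypothetical names invented for this
example, and the model assumes a window fires whenever the watermark is at or
past its end while a timer is still set.

```python
class ToyWatermarkTrigger:
    """Toy model of a watermark-driven trigger (hypothetical, not Beam's code)."""

    def __init__(self, clear_timer_on_fire):
        self.clear_timer_on_fire = clear_timer_on_fire
        self.timers = set()  # end-of-window timestamps with a pending timer

    def on_element(self, window_end):
        # Seeing an element in a window sets that window's watermark timer.
        self.timers.add(window_end)

    def advance_watermark(self, watermark):
        # Fire every window whose end the watermark has reached.
        fired = sorted(end for end in self.timers if watermark >= end)
        if self.clear_timer_on_fire:
            # The fix: drop the timer so the window cannot fire again.
            self.timers.difference_update(fired)
        return fired


def emitted_panes(clear_timer_on_fire):
    # One window ending at t=10; the watermark advances to 5, 10, then 15.
    trigger = ToyWatermarkTrigger(clear_timer_on_fire)
    trigger.on_element(window_end=10)
    panes = []
    for watermark in (5, 10, 15):
        panes.extend(trigger.advance_watermark(watermark))
    return panes


print(emitted_panes(clear_timer_on_fire=False))  # [10, 10]
print(emitted_panes(clear_timer_on_fire=True))   # [10]
```

With the timer left in place, the window fires again on every later watermark
advance, which is the duplicate output described above. Clearing the timer on
the first firing yields a single pane.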
--
This message was sent by Atlassian Jira
(v8.3.4#803005)