[ https://issues.apache.org/jira/browse/BEAM-10676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176813#comment-17176813 ]
Maximilian Michels commented on BEAM-10676:
-------------------------------------------
Small addition: By design, processing-time timers set the timer output
timestamp to the current input timestamp. In this issue, we adapted the Python
SDK to match the Java SDK, which sets the output timestamp of event-time
timers to the fire timestamp.
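
To make the rule concrete, here is a minimal sketch in plain Python. It is not
the actual Beam code: TimeDomain, default_output_timestamp and
output_watermark are hypothetical names used only for illustration, and the
watermark model is deliberately simplified (the output watermark is the
minimum of the input watermark and all active holds).

```python
from enum import Enum


class TimeDomain(Enum):
    # Hypothetical stand-in for the SDK's time domains.
    EVENT_TIME = "event_time"
    PROCESSING_TIME = "processing_time"


def default_output_timestamp(time_domain, fire_timestamp, input_timestamp):
    """Illustrative default only, not part of the Beam API.

    Event-time timers default to the fire timestamp; processing-time
    timers default to the timestamp of the current input element.
    """
    if time_domain is TimeDomain.EVENT_TIME:
        return fire_timestamp
    return input_timestamp


def output_watermark(input_watermark, holds):
    """Simplified model: the output watermark cannot pass any hold."""
    return min([input_watermark, *holds])


# Assumed numbers: element at t=10, event-time timer firing at t=70,
# input watermark currently at t=50.
old_hold = 10  # previous Python default: hold on the element timestamp
new_hold = default_output_timestamp(TimeDomain.EVENT_TIME, 70, 10)

stalled = output_watermark(50, [old_hold])   # held back at the element
advanced = output_watermark(50, [new_hold])  # follows the input watermark
```

Under these assumed numbers, the old default pins the output watermark to the
element timestamp until the timer fires, while a hold on the fire timestamp
lets it advance with the input watermark.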
> Timers use the input timestamp as the timer output timestamp which prevents
> watermark progress
> ----------------------------------------------------------------------------------------------
>
> Key: BEAM-10676
> URL: https://issues.apache.org/jira/browse/BEAM-10676
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core, sdk-py-harness
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
> Priority: P2
> Fix For: 2.24.0
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> By default, the Python SDK adds a timer output timestamp equal to the current
> timestamp of an element. This is problematic because
> 1. We hold back the output watermark on the current element's timestamp for
> every timer
> 2. It doesn't match the behavior in the Java SDK which defaults to using the
> fire timestamp as the timer output timestamp (and adds a hold on it)
> 3. There is no way for the user to influence this behavior because there is
> no user-facing API
> https://github.com/apache/beam/blob/dfadde2d3ee0a0487362dbcca80388fdc2ef2302/sdks/python/apache_beam/runners/worker/bundle_processor.py#L650
> We should use the fire timestamp as the default output timestamp.