[
https://issues.apache.org/jira/browse/BEAM-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17530105#comment-17530105
]
Reuven Lax commented on BEAM-14239:
-----------------------------------
strawperson proposal 1: we stop adding output timestamp to the tag, and every
time a timer fires we read the watermark hold state to see what the output
timestamp is.
main problem with this is that the extra read is expensive, and this strategy
will cause us to do it for _every_ timer, even the ones that never set an
output timestamp. so that leads me to
strawperson proposal 2: use the extra part of the timer tag to indicate that a
watermark hold is needed, storing a constant value there. We fetch the
watermark hold only if this bit is present in the tag.
Two problems with this proposal:
1. I'm not sure how to issue state fetches from inside
WindmillTimerInternals. Presumably it's possible to plumb something down.
2. What if user code resets a timer that was set with an output timestamp and
doesn't set an output timestamp? This will cause us to stop setting the extra
bit in the tag, and again both timers will fire.
We could also change the API so that the user needs to statically declare
whether given timers will have holds or not. However this is a fairly
backwards-incompatible change to the timer API.
> Changing the output timestamp of a timer does not clear the previously set
> timer
> --------------------------------------------------------------------------------
>
> Key: BEAM-14239
> URL: https://issues.apache.org/jira/browse/BEAM-14239
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Affects Versions: 2.37.0
> Reporter: Steve Niemitz
> Priority: P1
> Attachments: image-2022-04-04-09-57-29-583.png
>
>
> While looking into an unrelated bug with GroupIntoBatches, I noticed that it
> seems like changing the output timestamp of a timer does not clear the
> existing timer, and instead creates a new one.
> This kind of makes sense looking at the implementation of timers in Dataflow,
> the output timestamp is encoded into the timer ID, but this is not reflected
> in the timerStillPresent map in WindmillTimerInternals. It seems like it
> should be, and the previous timer should be deleted.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)