[
https://issues.apache.org/jira/browse/BEAM-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859960#comment-16859960
]
Jan Lukavský commented on BEAM-7520:
------------------------------------
I have experimented with this a little and have not yet figured out what the
correct solution should be. What I tried:
1) hold the input watermark at min(timestamps of the timers that are set)
2) fire timers based not on the input watermark, but on the output watermark
(the output watermark is held by the min timer timestamp)
The code there seems to be a little complicated, so I was not able to get
either option fully working; moreover, possibly neither option (1, 2) is
completely correct.
What would seem correct to me is to introduce another watermark (operator
watermark), with the following properties:
- at all times input watermark >= operator watermark >= output watermark holds,
- when the input watermark advances from T1 to T2, the operator watermark moves
from T1 to T2 step by step, in timestamp order of the timers that are set
within this time interval
- timers are fired based on the operator watermark
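To make the difference concrete, here is a minimal sketch in plain Python. It
is not DirectRunner code: the function names, the (timestamp, id) timer tuples,
the callback shape and the TIMESTAMP_MAX constant are all hypothetical, and
output watermark holds are left out. The first function extracts all eligible
timers up front, as described in the issue below; the second one advances an
operator watermark one timer at a time, so a timer set from a callback is
considered before any later timer fires.

```python
import heapq
from typing import Callable, List, Optional, Tuple

# All names here are hypothetical; this only models the ordering problem.
Timer = Tuple[int, str]  # (timestamp, timer id)
# A callback gets the fired timer and may return one new timer to set.
Callback = Callable[[Timer], Optional[Timer]]
TIMESTAMP_MAX = 10**6  # stand-in for BoundedWindow.TIMESTAMP_MAX_VALUE


def fire_in_batch(timers: List[Timer], input_wm: int, on_timer: Callback) -> List[Timer]:
    """Extract every eligible timer first, then fire the whole batch."""
    fired: List[Timer] = []
    pending = list(timers)
    while True:
        batch = sorted(t for t in pending if t[0] <= input_wm)
        if not batch:
            return fired
        pending = [t for t in pending if t[0] > input_wm]
        for t in batch:
            fired.append(t)
            new = on_timer(t)
            if new is not None:
                pending.append(new)  # only seen in the next extraction round


def fire_with_operator_watermark(
    timers: List[Timer], input_wm: int, on_timer: Callback
) -> List[Timer]:
    """Advance an operator watermark timer by timer, never past input_wm."""
    fired: List[Timer] = []
    heap = list(timers)
    heapq.heapify(heap)
    operator_wm: Optional[int] = None
    while heap and heap[0][0] <= input_wm:
        t = heapq.heappop(heap)
        # The operator watermark only moves forward, up to this timer's timestamp.
        operator_wm = t[0] if operator_wm is None else max(operator_wm, t[0])
        fired.append(t)
        new = on_timer(t)
        if new is not None:
            heapq.heappush(heap, new)  # competes with the timers still waiting
    assert operator_wm is None or operator_wm <= input_wm
    return fired


def reset_timer_b(t: Timer) -> Optional[Timer]:
    # timerB at timestamp 3 sets itself again for timestamp 4.
    return (4, "timerB") if t == (3, "timerB") else None


initial = [(10, "timerA"), (3, "timerB")]  # window [0, 10): timerA at maxTimestamp + 1
batch_order = fire_in_batch(initial, TIMESTAMP_MAX, reset_timer_b)
ordered = fire_with_operator_watermark(initial, TIMESTAMP_MAX, reset_timer_b)
print(batch_order)
print(ordered)
```

In this model the batch variant fires timerA before the second timerB, while
the operator-watermark variant keeps the timers in timestamp order. The sketch
assumes a callback never sets a timer earlier than the one currently firing.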
This solution seems to be working, but looks a little complicated. Is there a
simpler solution?
> DirectRunner timers are not strictly time ordered
> -------------------------------------------------
>
> Key: BEAM-7520
> URL: https://issues.apache.org/jira/browse/BEAM-7520
> Project: Beam
> Issue Type: Bug
> Components: runner-direct
> Affects Versions: 2.13.0
> Reporter: Jan Lukavský
> Priority: Major
>
> Let's suppose we have the following situation:
> - stateful ParDo with two timers - timerA and timerB
> - timerA is set for window.maxTimestamp() + 1
> - timerB is set anywhere between <windowStart, windowEnd), let's denote that
> timerB.timestamp
> - input watermark moves to BoundedWindow.TIMESTAMP_MAX_VALUE
> Then the order of timers is as follows (correct):
> - timerB
> - timerA
> But, if timerB sets another timer (say for timerB.timestamp + 1), then the
> order of timers will be:
> - timerB (timerB.timestamp)
> - timerA (BoundedWindow.TIMESTAMP_MAX_VALUE)
> - timerB (timerB.timestamp + 1)
> Which is not ordered by timestamp. The reason for this is that when the input
> watermark update is evaluated, WatermarkManager.extractFiredTimers() will
> produce both timerA and timerB. That would be correct on its own, but when
> timerB then sets another timer, the new timer fires only after timerA, which
> breaks the ordering.