[
https://issues.apache.org/jira/browse/BEAM-7520?focusedWorklogId=297977&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297977
]
ASF GitHub Bot logged work on BEAM-7520:
----------------------------------------
Author: ASF GitHub Bot
Created on: 20/Aug/19 15:54
Start Date: 20/Aug/19 15:54
Worklog Time Spent: 10m
Work Description: je-ik commented on issue #9190: [BEAM-7520] Fix timer
firing order in DirectRunner
URL: https://github.com/apache/beam/pull/9190#issuecomment-523078430
I need to push this forward, because it is a prerequisite for more PRs that I
have (mostly) ready. Those PRs need to be reliably testable, so sorry if I'm a
little impatient. :-)
@kennknowles, if you don't have time for this, would you please delegate
this?
Open questions:
* Is the current approach OK? It has been tested for not causing deadlocks,
and performance should be roughly comparable to current master. I'm aware
that the completely correct solution would be to evaluate new timers and add
them to the *current* bundle rather than rescheduling for later, but that
seems to be a major refactor (please, anyone confirm or disprove this
statement, as it might be wrong).
* Strange behavior related to `Create` vs. `TestStream` was observed in
DirectRunner, but it is not reproducible in this PR. It is described in
https://github.com/rezarokni/beam/pull/2#issuecomment-522226493
* The added test that uses `Create` instead of `TestStream` broke only the
ValidatesRunner suite of Dataflow (all other runners seem to pass).
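
For illustration only, here is a minimal Python sketch of what "adding timers
to the *current* bundle" in the first question could look like. It is not the
DirectRunner code: the function names, the `(timestamp, name)` tuple layout and
the concrete timestamps are assumptions made up for this example.

```python
import heapq


def fire_in_timestamp_order(timers, watermark, on_timer):
    """Fire (timestamp, name) timers up to `watermark` in timestamp order.

    Timers returned by `on_timer` are pushed into the same heap, so they
    compete with timers that were already eligible before the callback ran.
    """
    heap = list(timers)
    heapq.heapify(heap)
    fired = []
    while heap and heap[0][0] <= watermark:
        ts, name = heapq.heappop(heap)
        fired.append((name, ts))
        for new_timer in on_timer(name, ts):
            heapq.heappush(heap, new_timer)
    return fired


def reset_timer_b(name, ts):
    """Hypothetical callback: the first timerB firing sets timerB for ts + 1."""
    return [(ts + 1, "timerB")] if (name, ts) == ("timerB", 10) else []


# 100 stands in for window.maxTimestamp() + 1, 10**9 for the final watermark.
order = fire_in_timestamp_order(
    [(100, "timerA"), (10, "timerB")], 10**9, reset_timer_b
)
print(order)  # [('timerB', 10), ('timerB', 11), ('timerA', 100)]
```

Because the newly set timer enters the same priority queue before the next
timer is popped, it fires ahead of timerA, which is the strictly ordered
result the issue asks for.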
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 297977)
Time Spent: 6h 40m (was: 6.5h)
> DirectRunner timers are not strictly time ordered
> -------------------------------------------------
>
> Key: BEAM-7520
> URL: https://issues.apache.org/jira/browse/BEAM-7520
> Project: Beam
> Issue Type: Bug
> Components: runner-direct
> Affects Versions: 2.13.0
> Reporter: Jan Lukavský
> Assignee: Jan Lukavský
> Priority: Major
> Time Spent: 6h 40m
> Remaining Estimate: 0h
>
> Let's suppose we have the following situation:
> - stateful ParDo with two timers - timerA and timerB
> - timerA is set for window.maxTimestamp() + 1
> - timerB is set anywhere in [windowStart, windowEnd), let's denote that
> timestamp timerB.timestamp
> - input watermark moves to BoundedWindow.TIMESTAMP_MAX_VALUE
> Then the order of timers is as follows (correct):
> - timerB
> - timerA
> But, if timerB sets another timer (say for timerB.timestamp + 1), then the
> order of timers will be:
> - timerB (timerB.timestamp)
> - timerA (BoundedWindow.TIMESTAMP_MAX_VALUE)
> - timerB (timerB.timestamp + 1)
> This is not ordered by timestamp. The reason is that when the input
> watermark update is evaluated, WatermarkManager.extractFiredTimers() will
> produce both timerA and timerB. That would be correct on its own, but when
> timerB sets another timer, the new timer fires only after timerA, which
> breaks the ordering.
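
The ordering described in the quoted issue can be reproduced with a small
simulation. This is a sketch in plain Python under stated assumptions, not Beam
code: `fire_batch`, `on_timer` and the numeric timestamps are hypothetical
stand-ins, and the batch extraction only mimics the behavior the issue
attributes to extracting all fired timers at once.

```python
# Hypothetical timestamps, chosen only for illustration.
TIMER_A_TS = 100         # stands in for window.maxTimestamp() + 1
TIMER_B_TS = 10          # somewhere inside the window
MAX_WATERMARK = 10**9    # stands in for BoundedWindow.TIMESTAMP_MAX_VALUE


def on_timer(name, ts):
    """The first firing of timerB sets timerB again for ts + 1."""
    if name == "timerB" and ts == TIMER_B_TS:
        return [(ts + 1, "timerB")]
    return []


def fire_batch(timers, watermark):
    """Extract all eligible timers up front, then fire that batch in order.

    Timers set by a callback are only seen by the next extraction round,
    so they cannot overtake timers already in the current batch.
    """
    pending = list(timers)
    fired = []
    while True:
        batch = sorted(t for t in pending if t[0] <= watermark)
        if not batch:
            return fired
        pending = [t for t in pending if t[0] > watermark]
        for ts, name in batch:
            fired.append((name, ts))
            pending.extend(on_timer(name, ts))


result = fire_batch(
    [(TIMER_A_TS, "timerA"), (TIMER_B_TS, "timerB")], MAX_WATERMARK
)
print(result)  # [('timerB', 10), ('timerA', 100), ('timerB', 11)]
```

The second timerB firing (timestamp 11) comes after timerA (timestamp 100),
matching the out-of-order sequence listed in the issue.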
--
This message was sent by Atlassian Jira
(v8.3.2#803003)