Okay, that makes sense. I'm not sure how to fix this, though. Can I
suppose that someone from Dataflow team will take care of that?
On 11/1/19 12:16 AM, Kenneth Knowles wrote:
It is because Dataflow does not support TestStream, so one test is
disabled, and because the other test has only bounded inputs it is run
in batch mode. In this case we need to do either: force streaming mode
on Dataflow or have an unbounded input. We used to run two Validates
Runner suites, where one of them is forced to streaming mode for all
tests. We really would like to also run that, actually. I don't see it
anymore.
Kenn
On Thu, Oct 31, 2019 at 10:43 AM Jan Lukavský <[email protected]
<mailto:[email protected]>> wrote:
That is quite strange. The timer ordering tests were quite stable
on DirectRunner. Prior to the fix it failed consistently. Dataflow
on the other hand seems to consistently pass.
On 10/31/19 6:28 PM, Kenneth Knowles wrote:
Hmm, classical Dataflow should fail.
- all user timers in a bundle processed first:
https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/SimpleParDoFn.java#L353
- processed in a loop that drains the StepContext:
https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/SimpleParDoFn.java#L451
- the context just feeds the iterable for the current bundle (no
priority queue of newly set timers):
https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingModeExecutionContext.java#L550
Looks like we need some more tests.
Kenn
On Thu, Oct 31, 2019 at 10:06 AM Jan Lukavský <[email protected]
<mailto:[email protected]>> wrote:
Hi,
just today I noticed failures on portable dataflow [1] [2].
"Classical" dataflow seems to pass.
Jan
[1] https://issues.apache.org/jira/browse/BEAM-8530
[2] https://github.com/apache/beam/pull/9951
On 10/31/19 5:29 PM, Reuven Lax wrote:
Have you seen these failures on Dataflow as well? From code
examination I would expect Dataflow to have some bugs in
this area as well (especially if a timer is set while
processing a bundle). If the tests are passing on Dataflow
this might mean that we need different tests (or it might
mean that Dataflow is "working" for some mysterious reason
that is not obvious from the code :) ).
On Wed, Oct 23, 2019 at 2:54 AM Jan Lukavský
<[email protected] <mailto:[email protected]>> wrote:
Hi,
as part of [1] a new set of validatesRunner tests has
been introduced.
These tests (currently marked as category
UsesStrictTimerOrdering)
verify that runners fire timers in increasing timestamp
under all
circumstances. After adding these validatesRunner tests,
Samza [2] and
Portable Flink [3] started to fail these tests. I have
created the
tracking issues for that, because that behavior should
be fixed (timers
in wrong order can cause erratic behavior and/or data loss).
I'm writing to anyone interested in solving these issues.
Cheers,
Jan
[1] https://issues.apache.org/jira/browse/BEAM-7520
[2] https://issues.apache.org/jira/browse/BEAM-8459
[3] https://issues.apache.org/jira/browse/BEAM-8460