Okay, that makes sense. I'm not sure how to fix this, though. Can I suppose that someone from Dataflow team will take care of that?

On 11/1/19 12:16 AM, Kenneth Knowles wrote:
It is because Dataflow does not support TestStream, so one test is disabled, and because the other test has only bounded inputs it is run in batch mode. In this case we need to do either: force streaming mode on Dataflow or have an unbounded input. We used to run two Validates Runner suites, where one of them is forced to streaming mode for all tests. We really would like to also run that, actually. I don't see it anymore.

Kenn

On Thu, Oct 31, 2019 at 10:43 AM Jan Lukavský <[email protected] <mailto:[email protected]>> wrote:

    That is quite strange. The timer ordering tests were quite stable
    on DirectRunner. Prior to the fix it failed consistently. Dataflow
    on the other hand seems to consistently pass.

    On 10/31/19 6:28 PM, Kenneth Knowles wrote:
    Hmm, classical Dataflow should fail.

     - all user timers in a bundle processed first:
    
https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/SimpleParDoFn.java#L353
     - processed in a loop that drains the StepContext:
    
https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/SimpleParDoFn.java#L451
     - the context just feeds the iterable for the current bundle (no
    priority queue of newly set timers):
    
https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/StreamingModeExecutionContext.java#L550

    Looks like we need some more tests.

    Kenn

    On Thu, Oct 31, 2019 at 10:06 AM Jan Lukavský <[email protected]
    <mailto:[email protected]>> wrote:

        Hi,

        just today I noticed failures on portable dataflow [1] [2].
        "Classical" dataflow seems to pass.

        Jan

        [1] https://issues.apache.org/jira/browse/BEAM-8530

        [2] https://github.com/apache/beam/pull/9951

        On 10/31/19 5:29 PM, Reuven Lax wrote:
        Have you seen these failures on Dataflow as well? From code
        examination I would expect Dataflow to have some bugs in
        this area as well (especially if a timer is set while
        processing a bundle). If the tests are passing on Dataflow
        this might mean that we need different tests (or it might
        mean that Dataflow is "working" for some mysterious reason
        that is not obvious from the code :) ).

        On Wed, Oct 23, 2019 at 2:54 AM Jan Lukavský
        <[email protected] <mailto:[email protected]>> wrote:

            Hi,

            as part of [1] a new set of validatesRunner tests has
            been introduced.
            These tests (currently marked as category
            UsesStrictTimerOrdering)
            verify that runners fire timers in increasing timestamp
            under all
            circumstances. After adding these validatesRunner tests,
            Samza [2] and
            Portable Flink [3] started to fail these tests. I have
            created the
            tracking issues for that, because that behavior should
            be fixed (timers
            in wrong order can cause erratic behavior and/or data loss).

            I'm writing to anyone interested in solving these issues.

            Cheers,

              Jan

            [1] https://issues.apache.org/jira/browse/BEAM-7520

            [2] https://issues.apache.org/jira/browse/BEAM-8459

            [3] https://issues.apache.org/jira/browse/BEAM-8460

Reply via email to