What is the runner supposed to be doing to trigger the teardown of given bundle descriptors in an SDK harness?
Is there a fn API call I'm not interpreting correctly that should reliably trigger DoFn teardown, or generally that bundle processing is done? On Wed, May 15, 2019, 6:51 AM Robert Bradshaw <rober...@google.com> wrote: > This does bring up an interesting question though. Are runners > violating (the intent of) the spec if they simply abandon/kill workers > rather than gracefully bringing them down (e.g. so that these > callbacks can be invoked)? > > On Tue, May 7, 2019 at 3:55 PM Michael Luckey <adude3...@gmail.com> wrote: > > > > Thanks Kenn and Reuven. Based on your feedback, I amended to the PR [1] > implementing the missing calls to teardown. > > > > Best, > > > > michel > > > > [1] https://github.com/apache/beam/pull/8495 > > > > On Tue, May 7, 2019 at 6:09 AM Kenneth Knowles <k...@apache.org> wrote: > >> > >> > >> > >> On Mon, May 6, 2019 at 2:19 PM Reuven Lax <re...@google.com> wrote: > >>> > >>> > >>> > >>> On Mon, May 6, 2019 at 2:06 PM Kenneth Knowles <k...@apache.org> > wrote: > >>>> > >>>> The specification of TearDown is that it is best effort, certainly. > >>> > >>> > >>> Though I believe the intent of that specification was that a runner > will call it as long as the process itself has not crashed. > >> > >> > >> Yea, exactly. Or more abstractly that a runner will call it unless it > is impossible. If the hardware fails, a meteor strikes, etc, then teardown > will not be called. But in normal operation, particularly when the user > code throws a recoverable exception, it should be called. > >> > >> Kenn > >> > >>> > >>> > >>>> > >>>> If your runner supports it, then the test is good to make sure there > is not a regression. If your runner has partial support, that is within > spec. But the idea of the spec is more than you might have such a failure > that it is impossible to call the method, not simply never trying to call > it. > >>>> > >>>> I think it seems to match what we do elsewhere to leave the test, add > an annotation, make a note in the capability matrix about the limitation on > ParDo. > >>>> > >>>> Kenn > >>>> > >>>> On Mon, May 6, 2019 at 5:45 AM Michael Luckey <adude3...@gmail.com> > wrote: > >>>>> > >>>>> Hi, > >>>>> > >>>>> after stumbling upon [1] and trying to implement a fix [2], > ParDoLifeCycleTest are failing for > >>>>> direct runner, spark validatesRunnerBatch and flink > validatesRunnerBatch fail as DoFns teardown is not invoked, if DoFns setup > throw an exceptions. > >>>>> > >>>>> This seems to be in line with the specification [3], as this > explicitly states that 'teardown might not be called if unnecessary as > processed will be killed anyway'. > >>>>> > >>>>> No I am a bit lost on how to resolve this situation. Currently, we > seem to have following options > >>>>> - remove the test, although it seems valuable in different (e.g. > streaming?) cases > >>>>> - to satisfy the test implement the call to teardown in runners > although it seems unnecessary > >>>>> - add another annotation @CallsTeardownAfterFailingSetup, > @UsesFullParDoLifeCycle or such (would love to get suggestions for better > name here) > >>>>> - ? > >>>>> > >>>>> Thoughts? > >>>>> > >>>>> Best, > >>>>> > >>>>> michel > >>>>> > >>>>> > >>>>> > >>>>> [1] https://issues.apache.org/jira/browse/BEAM-7197 > >>>>> [2] https://github.com/apache/beam/pull/8495 > >>>>> [3] > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L676-L680 >