re

> 2. `task1 >> task2 >> teardown_task` to me falsely implies that teardown
> depends on task2, But it doesn't. It only depends on the "scope being
> exited".


So that's not quite the case.  With the proposed implementation, there's no
such scope concept.  They're just normal tasks, with special features.
With the above code, teardown_task does depend on task2.  A teardown will
run if its setup is successful and its non-setup upstreams are done.

re this one:

with TaskGroup("tg1"):
>     task1 >> teardown_task
>     task2 >> task3 >> task4 >> task5


With the proposed implementation, teardown runs after task 1 and doesn't
wait for task2 or its downstreams.  And the same rule applies.  Teardown
will run if its setups are successful and non-setup upstreams are done.  In
this case there's no setup so it is in effect all_done.

And one might ask, what's the purpose / benefit of having a teardown here?
Well, it's just that task1 requires the teardown and the others don't. And,
as a teardown, (1) even if it is happens to be a dag leaf, it can be
ignored for the purpose of dag run state and (2) teardowns are ignored as a
task group leaf when arrowing tg1 >> next_task so that its success is not
required for the dag to continue running after task1 is done.

Adding a setup to the example doesn't change too much:

with TaskGroup("tg1"):
    setup1 >> task1 >> teardown_task
    setup1 >> teardown_task
    task2 >> task3 >> task4 >> task5

So in this case we still just have two parallel sequences of tasks.
teardown_task will run if setup1 is successful and task1 is done.

task2 and downstreams don't care about task 1 and its setups / teardowns.
They are very much normal tasks that operate in the same way but you get
the power of the special behaviors re clearing, configurable dag run state
impact, and continuing downstream in spite of failure.

If task2 etc require the setup then you just add the arrows.








On Thu, Mar 23, 2023 at 4:12 PM Ash Berlin-Taylor <a...@apache.org> wrote:

> I'm obviously in favour of the way the AIP was written, and that's for two
> primary reasons.
>
> 1. It's analogous to setup and teardown in testing frameworks where you
> don't ever explicitly call them - the framework handles it for you.
> 2. `task1 >> task2 >> teardown_task` to me falsely implies that teardown
> depends on task2, But it doesn't. It only depends on the "scope being
> exited".
>
> And as for the Zen of Python point: python itself doesn't even follow them
> well. There are three ways of formatting strings in python.
>
> On thinking a bit more about it, I think I have a counter point to where I
> think explicit dependencies lead to a false expectation:
>
> ```
> with TaskGroup("tg1"):
>     task1 ≥≥ teardown_task
>
>     task2 >> task3 >> task4 >> task5
> ```
>
> Does teardown run as soon as task 1 is finished, or when all of task1 and
> task5 are finished?
>
> I very strongly believe that teardown should only run at the end of a
> TaskGroup - a hard rule on this makes it easier for users to reason about
> and understand it. If it's only as a result of it's explicit dependencies
> then it means users have to reason about when each teardown task is run in
> each situation as it might be different from dag to dag.
>
> In this case the teardown is akin to a "finally" block in python, and the
> TaskGroup is the "try" block, which I hope is a concept that almost
> everyone writing DAGs will understand and be able to relate too.
>
> Teardown tasks are already special in a number of ways (clearing
> behaviour, special failure rules for resulting dag run, different trigger
> rule) so users need to know how it works.
>
> So I vote for keeping it implicit only, but if we as a community favour
> explicit only then we need to have an enforced requirement that there only
> leaves of a TG can be teardown (if they are in use) -- i.e. `[taak1, task5]
> >> teardown_task` would be required in this case.
>
> (And a similar role for set up. If there are any, the only root tasks in a
> TG must be set up)
>
> Ash
>
> On 23 March 2023 22:16:42 GMT, Pierre Jeambrun <pierrejb...@gmail.com>
> wrote:
> >I am also in favor of explicit relationships only.
> >
> >From a person who didn't work on AIP-52, it seems easier to understand
> what
> >is going on without having to dive into the setup/teardown documentation.
> >
> >Le jeu. 23 mars 2023 à 22:53, Jed Cunningham <jedcunning...@apache.org> a
> >écrit :
> >
> >> I've been working closely with Daniel on AIP-52 for a while now, but
> I'll
> >> still share my thoughts here.
> >>
> >> I'm also in favor of only supporting explicit relationships.
> >>
> >> In my opinion, even if multiple setup/teardown per scope never
> materialize,
> >> explicit relationships are still a better choice due to the clarity it
> >> brings today.
> >>
>

Reply via email to