dstandish opened a new pull request, #33358:
URL: https://github.com/apache/airflow/pull/33358
With a DAG like the following, clearing w1 with downstream also clears w2:
```python
s1 >> w1 >> [w2, t1]
s1 >> t1
s2 >> t2
s2 >> w2 >> t2
```
We need to make sure that the setup for w2 also gets cleared. But, to avoid
the need to recurse to arbitrary depth for setups of setups, let's just say
that a setup cannot have a setup. A setup can *come after* another setup, but
it won't *be* a setup for the setup (and what's at stake is just the clearing
behavior).
Additional notes...
I created an intermediate set `also_include_ids` because it's possible we'll
hit the same tasks multiple times. Operator is hashable, but the hash attrs
are mutable so it feels icky to have a set of them. Simpler with strings. But
I think it would be fine if anyone thinks it's better to just use the operators.
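To illustrate the concern with a set of operators, here is a minimal sketch. `FakeOperator` is a hypothetical stand-in whose hash depends on a mutable attribute. It is not Airflow's operator class, and the task ids are taken from the example DAG purely for illustration.

```python
class FakeOperator:
    """Hypothetical stand-in: hashable, but the hash depends on a mutable attribute."""

    def __init__(self, task_id):
        self.task_id = task_id

    def __eq__(self, other):
        return isinstance(other, FakeOperator) and self.task_id == other.task_id

    def __hash__(self):
        return hash(self.task_id)


s2, t2 = FakeOperator("s2"), FakeOperator("t2")

# Collecting plain string ids: hitting the same task twice is harmless,
# and strings are immutable, so their hash can never change.
also_include_ids = set()
for op in [s2, t2, s2]:  # s2 is encountered twice
    also_include_ids.add(op.task_id)

# Collecting the objects themselves: if a hash attribute is mutated while the
# object sits in the set, a later membership check may no longer find it.
ops = {s2}
s2.task_id = "s2_renamed"
still_found = s2 in ops  # typically False, since the stored hash is stale
```

The string set sidesteps the stale-hash question entirely, at the cost of one lookup back from id to task when the ids are used.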
You may also notice that I added some logic, guarding a couple lines with
`if not t.is_setup`. This is to essentially say, a setup is assumed not to
"have" a setup. That is, one setup can come before another, but that's just
precedence -- the one is not a setup "for" the other.
The reason I believe we must do this: if we say that a setup can have
another setup (or that a teardown can have a setup / teardown), then when we
encounter a setup in a downstream clear, we would have to recurse into it for
its upstream setups. That would be quite annoying, and it's not worth it
because it's hard to imagine a valid use case. So, while we do have to reach
out for the setups of downstream _work_ tasks, that lookup only ever returns
setups and teardowns. Since we've said that a setup or teardown can't _itself_
have a setup or teardown, we can stop there; we do not have to recurse further
for more setups and teardowns to clear.
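As a rough sketch of the one-level expansion described above, here is a toy model of the example DAG using plain dicts and sets. All names here (`EDGES`, `TEARDOWN_FOR`, `setups_for`, `clear_downstream`) are hypothetical and only illustrate the idea; they are not Airflow's API or the code in this PR.

```python
from collections import defaultdict

# Toy edges mirroring the example DAG at the top of this description.
EDGES = [
    ("s1", "w1"), ("w1", "w2"), ("w1", "t1"),
    ("s1", "t1"),
    ("s2", "t2"),
    ("s2", "w2"), ("w2", "t2"),
]
# Assumed pairing of each setup with its teardown.
TEARDOWN_FOR = {"s1": "t1", "s2": "t2"}
SETUPS = set(TEARDOWN_FOR)
TEARDOWNS = set(TEARDOWN_FOR.values())

children = defaultdict(set)
for up, down in EDGES:
    children[up].add(down)


def descendants(task_id):
    """Return every task reachable downstream of task_id (excluding itself)."""
    seen = set()
    stack = [task_id]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def setups_for(task_id):
    """Setups whose scope contains a work task (task sits between setup and teardown)."""
    if task_id in SETUPS or task_id in TEARDOWNS:
        return set()  # a setup or teardown is assumed not to "have" a setup
    return {
        s
        for s, t in TEARDOWN_FOR.items()
        if task_id in descendants(s) and t in descendants(task_id)
    }


def clear_downstream(task_id):
    """Select the task, its descendants, and the setups/teardowns of selected work tasks."""
    selected = {task_id} | descendants(task_id)
    also_include_ids = set()
    for tid in selected:
        for s in setups_for(tid):
            also_include_ids.add(s)
            also_include_ids.add(TEARDOWN_FOR[s])
    # No recursion: everything in also_include_ids is a setup or teardown,
    # and those are assumed to have no setups of their own.
    return selected | also_include_ids
```

In this toy model, `clear_downstream("w1")` picks up `s2` through `w2` without a second pass, which is the behavior the no-setup-of-a-setup rule is meant to make sufficient.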