dstandish opened a new pull request, #33358:
URL: https://github.com/apache/airflow/pull/33358

   With a dag like this, if you clear w1 downstream then you also clear w2:
   
   ```python
       s1 >> w1 >> [w2, t1]
       s1 >> t1
       s2 >> t2
       s2 >> w2 >> t2
   ```
   
   We need to make sure that the setup for w2 (here, s2) also gets cleared.  
But, to avoid the need to recurse to arbitrary depth for setups of setups, 
let's just say that a setup cannot have a setup.  A setup can *come after* 
another setup, but it won't *be* a setup for that setup (and all that's at 
stake is the clearing behavior).
   
   Additional notes...
   
   I created an intermediate set `also_include_ids` because it's possible we'll 
hit the same tasks multiple times.  Operator is hashable, but the hash attrs 
are mutable so it feels icky to have a set of them.  Simpler with strings.  But 
I think it would be fine if anyone thinks it's better to just use the operators.
   
   You may also notice that I added some logic, guarding a couple lines with 
`if not t.is_setup`.  This is to essentially say, a setup is assumed not to 
"have" a setup.  That is, one setup can come before another, but that's just 
precedence -- the one is not a setup "for" the other.
   
   The reason I believe we must do this: if we say that a setup can have 
another setup (or that a teardown can have a setup / teardown), then when we 
encounter a setup in a downstream clear, we would have to recurse into it for 
its upstream setups.  That would be quite annoying, and it's not worth it 
because it's hard to imagine a valid use case.  So, while we do have to reach 
out for the setups of downstream _work_ tasks, that lookup only ever returns 
setups and teardowns, and since we've said that a setup or teardown can't 
_itself_ have a setup or teardown, we know we can stop there; we do not have to 
recurse further for more setups and teardowns to clear.
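
To make the rule concrete, here is a minimal toy model of the clearing behavior described above. It is a sketch, not the code in this PR and not Airflow's API: the graph is plain strings, `reachable` and `ids_to_clear` are hypothetical helpers, the `TEARDOWN_OF` pairing (s1 with t1, s2 with t2) is assumed from the task names, and "a setup is *for* a task" is assumed to mean the setup is upstream of the task and its teardown is downstream of it.

```python
from collections import defaultdict

# Edges of the example dag from the description.
EDGES = [
    ("s1", "w1"), ("w1", "w2"), ("w1", "t1"), ("s1", "t1"),
    ("s2", "t2"), ("s2", "w2"), ("w2", "t2"),
]
# Assumption: setup/teardown pairing inferred from the task names.
TEARDOWN_OF = {"s1": "t1", "s2": "t2"}
SETUPS = set(TEARDOWN_OF)
TEARDOWNS = set(TEARDOWN_OF.values())

downstream = defaultdict(set)
upstream = defaultdict(set)
for src, dst in EDGES:
    downstream[src].add(dst)
    upstream[dst].add(src)


def reachable(start, graph):
    """Return every node reachable from start (start itself excluded)."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def ids_to_clear(task_id):
    """Task ids cleared when clearing task_id with downstream (toy model)."""
    selected = {task_id} | reachable(task_id, downstream)
    # A set of string ids, since the same task may be hit more than once.
    also_include_ids = set()
    for t in selected:
        # A setup or teardown is assumed not to "have" a setup, so skip it.
        # This guard is what lets us stop after one pass with no recursion.
        if t in SETUPS or t in TEARDOWNS:
            continue
        for s in reachable(t, upstream) & SETUPS:
            if TEARDOWN_OF[s] in reachable(t, downstream):
                also_include_ids.add(s)
                also_include_ids.add(TEARDOWN_OF[s])
    return selected | also_include_ids


print(sorted(ids_to_clear("w1")))
print(sorted(ids_to_clear("w2")))
```

In this model, clearing w1 downstream reaches w2, and the single pass over the selected work tasks then pulls in s2 without ever looking for setups of s2 itself.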


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
