bbovenzi commented on issue #26059:
URL: https://github.com/apache/airflow/issues/26059#issuecomment-1468742220

   Did a bunch of testing.
   To replicate: clear any task in a task group that contains a join node. In 
our example_dags, `example_task_group.section_2.task_1` shows this well. The 
join node disappears after the task is cleared, and since the node is missing 
we can no longer connect the two task groups, so the graph breaks.
   
   This happens because clear calls `dag.partial_subset()`, and that function 
is not properly copying the `dag.task_group` in its memo 
[here](https://github.com/apache/airflow/blob/main/airflow/models/dag.py#L2183).
 Removing the task_group memo and calling `filter_task_group` 
[here](https://github.com/apache/airflow/blob/main/airflow/models/dag.py#L2240) 
with the copied `dag.task_group` instead of `self.task_group` works, but it is 
significantly slower for large dags (2000+ tasks).
   I'm not quite sure of the best way to fix our deep copy memo.
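   
   For anyone unfamiliar with how a pre-seeded `deepcopy` memo behaves, here 
is a minimal sketch. The `Group` and `Task` classes are hypothetical toy 
stand-ins, not Airflow's classes, and this is not the code in `dag.py`. It 
only illustrates the general mechanism: an object whose `id()` is already in 
the memo is not copied, so the "copy" keeps pointing at the original group and 
any mutation made through it also changes the original.

```python
import copy


class Group:
    """Toy stand-in for a task group (hypothetical, not Airflow's TaskGroup)."""

    def __init__(self, name):
        self.name = name
        self.children = {}


class Task:
    """Toy stand-in for a task that holds a reference to its group."""

    def __init__(self, task_id, group):
        self.task_id = task_id
        self.group = group
        group.children[task_id] = self


root = Group("root")
task = Task("task_1", root)

# Plain deep copy: the group is copied together with the task.
plain = copy.deepcopy(task)
plain_is_shared = plain.group is root

# Pre-seeded memo: deepcopy finds id(root) in the memo and returns the
# stored value instead of making a new Group.
memo = {id(root): root}
seeded = copy.deepcopy(task, memo)
seeded_is_shared = seeded.group is root

# Because the group is shared, editing it through the copy edits the original.
del seeded.group.children["task_1"]
original_lost_child = "task_1" not in root.children

print(plain_is_shared, seeded_is_shared, original_lost_child)
```

   Skipping the memo avoids the sharing but copies the whole group tree each 
time, which fits the slowdown I saw on large dags.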
   
   We also use `partial_subset` for filtering upstream/downstream, so you can 
replicate the issue that way too.

