bbovenzi commented on code in PR #30129:
URL: https://github.com/apache/airflow/pull/30129#discussion_r1137451818


##########
airflow/models/dag.py:
##########
@@ -2215,29 +2215,29 @@ def _deepcopy_task(t) -> Operator:
 
         def filter_task_group(group, parent_group):
             """Exclude tasks not included in the subdag from the given TaskGroup."""
-            copied = copy.copy(group)
-            copied.used_group_ids = set(copied.used_group_ids)
-            copied._parent_group = parent_group
-
-            copied.children = {}
+            memo = {id(group.children): {}}
+            if parent_group:
+                memo[id(group.parent_group)] = parent_group
+            copied = copy.deepcopy(group, memo)

Review Comment:
   I tested this on a very large DAG, and I don't think this memo is working
   correctly: the time went from 127 ms to 22 s.
   
   Before:
   <img width="802" alt="Screenshot 2023-03-15 at 12 58 21 PM" src="https://user-images.githubusercontent.com/4600967/225384422-22b003c0-67d6-4219-8c7d-2e543f9f80c1.png">
   
   After:
   <img width="794" alt="Screenshot 2023-03-15 at 12 55 43 PM" src="https://user-images.githubusercontent.com/4600967/225383299-c172fb50-30d0-4e55-b8e5-5bd224082203.png">
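
For context on what the memo in the diff is meant to do, here is a minimal sketch of how `copy.deepcopy` treats a pre-seeded memo. The `Node` class and its `name`, `parent`, and `children` attributes are hypothetical stand-ins, not Airflow's `TaskGroup`. The point is only that `deepcopy` looks each object up by `id()` and, on a hit, returns the stored value instead of copying.

```python
import copy


class Node:
    """Hypothetical stand-in for a group-like object (not Airflow code)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}


root = Node("root")
group = Node("group", parent=root)
group.children["task"] = Node("task", parent=group)

new_parent = Node("new_root")

# Pre-seed the memo, mirroring the shape used in the diff:
# keys are id() values of objects deepcopy would otherwise copy,
# values are what deepcopy should return in their place.
memo = {id(group.children): {}}
memo[id(group.parent)] = new_parent

copied = copy.deepcopy(group, memo)

print(copied.children)              # the substituted empty dict
print(copied.parent is new_parent)  # the substituted parent
print(len(group.children))          # the original is left untouched
```

Only the two pre-seeded objects are substituted. Any other attribute on the object is still traversed and copied in full.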
   
   
   DAG:
   ```
   from datetime import datetime
   
   from airflow.models.dag import DAG
   from airflow.operators.dummy import DummyOperator
   from airflow.decorators import task_group
   
   with DAG(
       "wide_dummy",
       schedule_interval=None,
       start_date=datetime(2021, 1, 1),
       catchup=True,
   ) as wide_dummy:
   
       for i in range(100):
   
           @task_group(group_id=f"group-{i}")
           def group():
               for t in range(10):
                   DummyOperator(task_id=f"out_{i}_{t}")
                   DummyOperator(task_id=f"out2_{i}_{t}")
   
           group()
   ```
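
As a rough way to reason about a slowdown of this kind outside Airflow, the sketch below times a shallow copy against a `deepcopy` with a fresh memo per call, as the diff builds it. The `Registry` and `Group` classes are hypothetical. In particular, the `registry` attribute stands in for any object reachable from every group that is not pre-seeded in the memo. Whether `TaskGroup` actually holds such a reference is an assumption here, not something confirmed above.

```python
import copy
import timeit


class Registry:
    """Hypothetical shared object reachable from every group."""

    def __init__(self, size):
        self.items = {f"item-{n}": [n] for n in range(size)}


class Group:
    """Hypothetical stand-in for a group; not Airflow's TaskGroup."""

    def __init__(self, group_id, registry):
        self.group_id = group_id
        self.registry = registry  # assumed un-seeded shared reference
        self.children = {}


registry = Registry(500)
groups = [Group(f"group-{i}", registry) for i in range(100)]


def shallow_all():
    # copy.copy duplicates only the Group object; attributes are shared.
    return [copy.copy(g) for g in groups]


def deep_all():
    # A fresh memo per group means the shared registry is copied
    # again for every group, since no earlier copy is remembered.
    return [copy.deepcopy(g, {id(g.children): {}}) for g in groups]


# Take the best of a few runs to reduce timing noise.
shallow_s = min(timeit.repeat(shallow_all, number=1, repeat=3))
deep_s = min(timeit.repeat(deep_all, number=1, repeat=3))

print(f"shallow: {shallow_s * 1e3:.2f} ms")
print(f"deep:    {deep_s * 1e3:.2f} ms")
```

The absolute numbers depend on the machine. The comparison only shows that the cost of `deepcopy` scales with everything reachable and not memoized, multiplied by the number of calls.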
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
