yuqian90 commented on a change in pull request #10153:
URL: https://github.com/apache/airflow/pull/10153#discussion_r472553598



##########
File path: airflow/models/baseoperator.py
##########
@@ -382,7 +389,16 @@ def __init__(
                 stacklevel=3
             )
         validate_key(task_id)
-        self.task_id = task_id
+        self.label = task_id
+
+        # Prefix task_id with group_id
+        task_group = task_group or TaskGroupContext.get_current_task_group(dag)
+        if task_group:
+            self.task_id = f"{task_group.group_id}.{self.label}" if task_group.group_id else self.label

Review comment:
   Hi @houqp, `self.task_id` is only prefixed with `group_id` when the DAG 
uses TaskGroup. For existing DAGs that do not use TaskGroup, `self.task_id` 
remains equal to the passed-in `task_id`, so they are unchanged.
    
   The reason I'm prefixing `self.task_id` with `group_id` is to avoid 
duplicate task_ids in `dag.task_dict`. E.g., if we write code like this:
   ```
   
   def create_section():
       task1 = DummyOperator(task_id="task1")
       task2 = DummyOperator(task_id="task2")
   
   with DAG(...) as dag:
       with TaskGroup("section1") as section1:
           create_section()
   
       with TaskGroup("section2") as section2:
           create_section()
   ```
   
   The actual task_ids stored in `dag.task_dict` will be the following, which 
avoids duplication:
   ```
   section1.task1
   section1.task2
   section2.task1
   section2.task2
   ```
   The labels users need to see in Graph View are the following. They don't 
need to see the fully-qualified task_id in the graph because the task is 
nested within the group.
   
   ```
   section1
       task1
       task2
   section2
       task1
       task2
   ```
   
   
   Alternatively, if we want to avoid changing the semantics of `self.task_id` 
when using TaskGroup, we can ask the user to pass in `label` separately 
and let the user modify the task_id themselves to avoid duplication.
   I find this puts an extra burden on the user. However, maybe there are 
advantages to this too? What do you think? 
   
   ```
   def create_section(group_id):
       task1 = DummyOperator(task_id=f"{group_id}.task1", label="task1")
       task2 = DummyOperator(task_id=f"{group_id}.task2", label="task2")
   
   with DAG(...) as dag:
       with TaskGroup("section1") as section1:
           create_section(section1.group_id)
   
       with TaskGroup("section2") as section2:
           create_section(section2.group_id)
   ```
   





