alex-astronomer edited a comment on issue #16764:
URL: https://github.com/apache/airflow/issues/16764#issuecomment-1014936179


   
   Did some more research, and it leads me to believe that we should treat a 
TaskGroup as a "dependable" in the same way that tasks are: TaskGroups may 
depend on, or be depended on by, Tasks and other TaskGroups.  If we do, we will 
be able to avoid many other problems like this one.
   
   ---
   
   *Appendix A: Expected Graph and Tree View*
   
   
![7280204D-5796-4741-8F2E-CB0E06C91A28](https://user-images.githubusercontent.com/89415310/149844546-425041a5-576f-4fe6-bcc7-c4124b8f62ec.png)
   
![CDDC9FF2-B1ED-476B-BF35-845FD6028BC0](https://user-images.githubusercontent.com/89415310/149844554-1cbf4e1e-8e73-41fa-a2e6-0bcf07c4cc98.png)
   
   ---
   
   I expect all of the definitions below to produce a graph view, tree view, 
and actual running order that match the pictures linked in Appendix A.
   
   Here are the definitions that I found that give the correct graph and tree 
view:
   
   ```
   with TaskGroup('tg') as taskgroup:
       task1 = PythonOperator(task_id='hello1', python_callable=_print_hello)
       task2 = PythonOperator(task_id='hello2', python_callable=_print_hello)
       task1 >> task2
       # Group-level dependency set last, still inside the TaskGroup context.
       start >> taskgroup >> end
   ```
   
   and
   
   ```
   with TaskGroup('tg') as taskgroup:
       task1 = PythonOperator(task_id='hello1', python_callable=_print_hello)
       task2 = PythonOperator(task_id='hello2', python_callable=_print_hello)
       task1 >> task2

   # Group-level dependency set after leaving the TaskGroup context.
   start >> taskgroup >> end
   ```
   
   ---
   
   The definition below gives a graph and tree view that are consistent with 
each other, but they are incorrect and do not match Appendix A:
   
   ```
   with TaskGroup('tg') as taskgroup:
       task1 = PythonOperator(task_id='hello1', python_callable=_print_hello)
       task2 = PythonOperator(task_id='hello2', python_callable=_print_hello)
       # Group-level dependency set before the internal dependency.
       start >> taskgroup >> end
       task1 >> task2
   ```
   
   
![53DDE4F7-DA19-4FBB-A1D3-359F4F7763F4](https://user-images.githubusercontent.com/89415310/149845400-cb5e6cb5-611a-4333-8610-bd45c56451ec.png)
   
![C3AED63E-C82A-4D16-A82F-81E344566624](https://user-images.githubusercontent.com/89415310/149845405-60cf61a7-73f5-43a6-b2be-199b325a35bf.png)
   
   ---
   
   The definition below gives inconsistent tree and graph views, as well as an 
incorrect running order.  This is the example given by the OP of this issue.
   
   ```
   with TaskGroup('tg') as taskgroup:
       # Group-level dependency set before any internal task exists.
       start >> taskgroup >> end
       task1 = PythonOperator(task_id='hello1', python_callable=_print_hello)
       task2 = PythonOperator(task_id='hello2', python_callable=_print_hello)
       task1 >> task2
   ```
   
![53DDE4F7-DA19-4FBB-A1D3-359F4F7763F4](https://user-images.githubusercontent.com/89415310/149844811-2635d8ae-c993-403d-a9c3-4e7c3e344fa4.png)
   
![C3AED63E-C82A-4D16-A82F-81E344566624](https://user-images.githubusercontent.com/89415310/149844820-b8a91a52-810a-4235-97b8-c17b4826d193.png)
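   
   A toy model may help explain the three outcomes above.  This is a minimal 
sketch in plain Python, not Airflow's implementation: `EagerGroup`, `add`, 
`link`, `wrap`, and `build` are hypothetical names, and the assumption is that 
a group-level dependency is expanded against whatever tasks and internal edges 
the group holds at the moment it is declared.

```python
class EagerGroup:
    """Toy group that expands group-level edges the moment they are declared."""

    def __init__(self):
        self.tasks = []     # ids of member tasks registered so far
        self.edges = set()  # (upstream_id, downstream_id) pairs

    def add(self, task_id):
        self.tasks.append(task_id)

    def link(self, upstream, downstream):
        # Internal dependency between two member tasks.
        self.edges.add((upstream, downstream))

    def roots(self):
        # Members that currently have no incoming edge.
        targets = {down for _, down in self.edges}
        return [t for t in self.tasks if t not in targets]

    def leaves(self):
        # Members that currently have no outgoing edge.
        sources = {up for up, _ in self.edges}
        return [t for t in self.tasks if t not in sources]

    def wrap(self, upstream, downstream):
        # upstream >> group >> downstream, expanded against the current members.
        # Roots and leaves are snapshotted first so the new edges do not
        # influence each other.
        roots, leaves = self.roots(), self.leaves()
        self.edges |= {(upstream, r) for r in roots}
        self.edges |= {(leaf, downstream) for leaf in leaves}


def build(order):
    """Apply the named events in the given order and return the resulting edges."""
    group = EagerGroup()
    events = {
        "tasks": lambda: [group.add(t) for t in ("hello1", "hello2")],
        "wrap": lambda: group.wrap("start", "end"),
        "link": lambda: group.link("hello1", "hello2"),
    }
    for name in order:
        events[name]()
    return group.edges


for order in (("tasks", "link", "wrap"),
              ("tasks", "wrap", "link"),
              ("wrap", "tasks", "link")):
    print(order, sorted(build(order)))
```

   Under this assumption only the first ordering yields the single chain from 
`start` through `hello1` and `hello2` to `end`.  The second connects both tasks 
to `start` and `end` before linking them, and the third leaves `start` and 
`end` disconnected from the group's tasks.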
   
   ---
   
   What the examples and diagrams above show is that a few events, depending 
on their order, affect the correctness of the dependencies in the DAG as well 
as the graph and tree views, which are sometimes inconsistent with each other.  
The significant events in these definitions, as far as I can see, are:
   1. taskgroup variable defined
   2. internal tasks defined
   3. dependency set between `start >> taskgroup >> end`
   4. "internal" dependency set between `hello1 >> hello2`
   
   Step 1 must take place before any of steps 2, 3, or 4.  That leaves three 
steps whose relative order can vary, and that order affects the graph view, 
tree view, and running order of the DAG.
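   
   The orderings that can actually be written down can be enumerated with a 
short sketch.  One caveat that is my own addition rather than part of the 
analysis above: in plain Python, step 4 cannot come before step 2, because 
`task1` and `task2` must exist before `task1 >> task2` is evaluated.  The names 
`STEPS` and `valid_orderings` are hypothetical.

```python
from itertools import permutations

# Step numbers follow the list above; step 1 (the group itself) always comes first.
STEPS = {
    2: "internal tasks defined",
    3: "start >> taskgroup >> end",
    4: "hello1 >> hello2",
}


def valid_orderings():
    """Return orderings of steps 2-4 in which the tasks exist before they are linked."""
    return [
        order
        for order in permutations(STEPS)
        if order.index(2) < order.index(4)
    ]


for order in valid_orderings():
    print(" -> ".join(STEPS[step] for step in order))
```

   Three orderings remain, and they correspond to the definitions shown 
earlier: `(2, 4, 3)` is the pair that renders correctly, `(2, 3, 4)` is the 
consistent but incorrect one, and `(3, 2, 4)` is the OP's example.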
   
   I believe that all of the definitions above should give the running order 
and graph/tree view specified in Appendix A.  In other words, steps 2, 3, and 4 
from the list above should be able to run in any order, and the result should 
always be the same.
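   
   One way to get that order independence, sketched here as a plain-Python toy 
and not as a patch to Airflow, is to record the group itself as the thing being 
depended on and to expand it into task-level edges only when the graph is read. 
`DeferredGroup`, `wrap`, `link`, `resolve`, and `build` are hypothetical names, 
and the expected chain assumes Appendix A shows `start`, `hello1`, `hello2`, 
`end` in sequence.

```python
class DeferredGroup:
    """Toy group that stores group-level edges and expands them on demand."""

    def __init__(self):
        self.tasks = []         # step 2: ids of member tasks
        self.internal = set()   # step 4: (upstream_id, downstream_id) pairs
        self.upstream = []      # step 3: nodes the whole group depends on
        self.downstream = []    # step 3: nodes that depend on the whole group

    def add(self, task_id):
        self.tasks.append(task_id)

    def link(self, upstream, downstream):
        self.internal.add((upstream, downstream))

    def wrap(self, upstream, downstream):
        # Only record the relationship; do not look at the members yet.
        self.upstream.append(upstream)
        self.downstream.append(downstream)

    def resolve(self):
        """Expand the group-level edges against the final set of members."""
        targets = {down for _, down in self.internal}
        sources = {up for up, _ in self.internal}
        roots = [t for t in self.tasks if t not in targets]
        leaves = [t for t in self.tasks if t not in sources]
        edges = set(self.internal)
        edges |= {(up, root) for up in self.upstream for root in roots}
        edges |= {(leaf, down) for leaf in leaves for down in self.downstream}
        return edges


def build(order):
    """Run steps 2, 3 and 4 in the given order and return the resolved edges."""
    group = DeferredGroup()
    steps = {
        2: lambda: [group.add(t) for t in ("hello1", "hello2")],
        3: lambda: group.wrap("start", "end"),
        4: lambda: group.link("hello1", "hello2"),
    }
    for step in order:
        steps[step]()
    return group.resolve()


expected = {("start", "hello1"), ("hello1", "hello2"), ("hello2", "end")}
for order in ((2, 4, 3), (2, 3, 4), (3, 2, 4)):
    print(order, build(order) == expected)
```

   Because `resolve` computes roots and leaves only after every step has run, 
the three orderings produce the same edges in this model.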



