eladkal commented on a change in pull request #20700:
URL: https://github.com/apache/airflow/pull/20700#discussion_r779858794
##########
File path: docs/apache-airflow/concepts/dags.rst
##########
@@ -609,6 +611,37 @@ Note that :doc:`pools` are *not honored* by :class:`~airflow.operators.subdag.Su
 resources could be consumed by SubdagOperators beyond any limits you may have set.
+TaskGroups vs SubDAGs
+----------------------
+
+SubDAGs, while serving a similar purpose as TaskGroups, introduces both performance and functional issues due to its implementation.
+
+* The SubDagOperator starts a BackfillJob, which ignores existing parallelism configurations potentially oversubscribing the worker environment.
+* SubDAGs have their own DAG attributes. When the SubDAG DAG attributes are inconsistent with its parent DAG, unexpected behavior can occur.
+* Unable to see the "full" DAG in one view as SubDAGs exists as a full fledged DAG.
+* SubDAGs introduces all sorts of edge cases and caveats. This can disrupt user experience and expectation.
+
+TaskGroups, on the other hand, is a better option given that it is purely a UI grouping concept. All tasks within the TaskGroup still behave as any other tasks outside of the TaskGroup.
+
+You can see the core differences between these two constructs.
+
++--------------------------------------------------------+--------------------------------------------------------+
+| TaskGroup                                              | SubDAG                                                 |
++========================================================+========================================================+
+| Repeating patterns as part of the same DAG             | Repeating patterns as a separate DAG                   |
++--------------------------------------------------------+--------------------------------------------------------+
+| One set of views and statistics for the DAG            | Separate set of views and statistics between parent    |
+|                                                        | and child DAGs                                         |
++--------------------------------------------------------+--------------------------------------------------------+
+| One set of DAG configuration                           | Several sets of DAG configurations                     |
++--------------------------------------------------------+--------------------------------------------------------+
+| Honors parallelism configurations through existing     | Does not honor parallelism configurations due to       |
+| SchedulerJob                                           | newly spawned BackfillJob                              |
++--------------------------------------------------------+--------------------------------------------------------+
+| Simple construct declaration with context manager      | Complex DAG factory with naming restrictions           |
++--------------------------------------------------------+--------------------------------------------------------+
+
+
Review comment:
I would suggest adding a conclusion here stating that SubDAGs are deprecated,
so TaskGroup is always the preferred choice.
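
To illustrate why the TaskGroup side of the table is the simpler one, here is a toy sketch in plain Python. It is not Airflow code: `ToyDag`, `task_group`, and `add_task` are hypothetical names made up for this example. It only models the point from the proposed text that a TaskGroup is a grouping label, so grouped tasks are registered on the same DAG as every other task, with the group id used as a prefix.

```python
from contextlib import contextmanager


class ToyDag:
    """Hypothetical stand-in for a DAG: a flat registry of task ids."""

    def __init__(self, dag_id):
        self.dag_id = dag_id
        self.task_ids = []
        self._prefixes = []  # stack of currently active group ids

    @contextmanager
    def task_group(self, group_id):
        # Entering a group only changes how task ids are labelled;
        # no separate DAG object is created.
        self._prefixes.append(group_id)
        try:
            yield self
        finally:
            self._prefixes.pop()

    def add_task(self, task_id):
        # Every task lands in the same registry, grouped or not.
        full_id = ".".join(self._prefixes + [task_id])
        self.task_ids.append(full_id)
        return full_id


dag = ToyDag("example")
dag.add_task("start")
with dag.task_group("section_1"):
    dag.add_task("task_a")
    dag.add_task("task_b")
dag.add_task("end")

print(dag.task_ids)
# ['start', 'section_1.task_a', 'section_1.task_b', 'end']
```

In Airflow itself the corresponding construct is the `TaskGroup` context manager from `airflow.utils.task_group`; the sketch above only mirrors its grouping behaviour and says nothing about scheduling.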