This is an automated email from the ASF dual-hosted git repository.
eladkal pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/main by this push:
new 5eef85e2cea Update task lifecycle diagram for missing states (#46056)
5eef85e2cea is described below
commit 5eef85e2ceaa6d5b52a790db358c5ee066401f95
Author: Po-Yu Hsieh <[email protected]>
AuthorDate: Thu Mar 20 15:36:12 2025 +0800
Update task lifecycle diagram for missing states (#46056)
* Updates task lifecycle diagram for missing states
Details:
- Adds missing states to the diagram ("skipped" and
"deferred")
- Adds missing condition nodes to state transition
- Adds "trigger" component to the diagram
- Coverts multiple conditional branch in the graph into
sequence (multi-staged) of binary conditional branches
- Split states into three categories: "shared states",
"states only for sensors", and "states only for deferrable
tasks"
- Uses different icons to differentiate task states and Airflow
components
* Adds transitions for triggerer tasks
Details:
- Adds conditions and transitions for tasks that only require
triggerer to execute
* Make cycles look better
* [WIP] Initial attempt for generating diagram
Details:
- Adds icon images to docs/diagrams
- Adds 'diagram_task_lifecycle_diagram' script to generate
lifecycle diagram with `diagrams`
- Initial attempt to build up the graph, without label for
condition branch edges, and some redundant edges
* Adjusts branch labels; Adds label to condition branches
Details:
- Adjusts branch label location by padding newline and
whitespace
- Uses `Edge` constructor to add label for condition branch edges
- Removes redundant edges and correct incorrect edge directions
* Remove original image; Update documentation
* Adds legend cluster
* File renaming
Details:
- "diagram_task_lifecycle_diagram" > "diagram_task_lifecycle"
---
docs/apache-airflow/core-concepts/tasks.rst | 2 +-
.../img/diagram_task_lifecycle.md5sum | 1 +
docs/apache-airflow/img/diagram_task_lifecycle.png | Bin 0 -> 529144 bytes
docs/apache-airflow/img/diagram_task_lifecycle.py | 213 +++++++++++++++++++++
docs/apache-airflow/img/task_lifecycle_diagram.png | Bin 32164 -> 0 bytes
docs/diagrams/task_lifecycle/component.png | Bin 0 -> 2096 bytes
docs/diagrams/task_lifecycle/condition.png | Bin 0 -> 1913 bytes
docs/diagrams/task_lifecycle/deferrable_state.png | Bin 0 -> 2625 bytes
docs/diagrams/task_lifecycle/sensor_state.png | Bin 0 -> 2658 bytes
docs/diagrams/task_lifecycle/shared_state.png | Bin 0 -> 2482 bytes
docs/diagrams/task_lifecycle/terminal_state.png | Bin 0 -> 2624 bytes
11 files changed, 215 insertions(+), 1 deletion(-)
diff --git a/docs/apache-airflow/core-concepts/tasks.rst
b/docs/apache-airflow/core-concepts/tasks.rst
index fbfa4ae8203..5e4a7d0786e 100644
--- a/docs/apache-airflow/core-concepts/tasks.rst
+++ b/docs/apache-airflow/core-concepts/tasks.rst
@@ -85,7 +85,7 @@ The possible states for a Task Instance are:
* ``deferred``: The task has been :doc:`deferred to a trigger
<../authoring-and-scheduling/deferring>`
* ``removed``: The task has vanished from the DAG since the run started
-.. image:: /img/task_lifecycle_diagram.png
+.. image:: /img/diagram_task_lifecycle.png
Ideally, a task should flow from ``none``, to ``scheduled``, to ``queued``, to
``running``, and finally to ``success``.
diff --git a/docs/apache-airflow/img/diagram_task_lifecycle.md5sum
b/docs/apache-airflow/img/diagram_task_lifecycle.md5sum
new file mode 100644
index 00000000000..5b00beaa9d8
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_task_lifecycle.md5sum
@@ -0,0 +1 @@
+ef689d2a19fcef658dca32076bb0bfd4
diff --git a/docs/apache-airflow/img/diagram_task_lifecycle.png
b/docs/apache-airflow/img/diagram_task_lifecycle.png
new file mode 100644
index 00000000000..6f5f4e25a30
Binary files /dev/null and b/docs/apache-airflow/img/diagram_task_lifecycle.png
differ
diff --git a/docs/apache-airflow/img/diagram_task_lifecycle.py
b/docs/apache-airflow/img/diagram_task_lifecycle.py
new file mode 100644
index 00000000000..969a4fee324
--- /dev/null
+++ b/docs/apache-airflow/img/diagram_task_lifecycle.py
@@ -0,0 +1,213 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from pathlib import Path
+
+from diagrams import Cluster, Diagram, Edge
+from diagrams.custom import Custom
+from diagrams.programming.flowchart import StartEnd
+from rich.console import Console
+
+MY_DIR = Path(__file__).parent
+MY_FILENAME = Path(__file__).with_suffix("").name
+COMPONENT_IMG = (MY_DIR.parents[1] / "diagrams" / "task_lifecycle" /
"component.png").as_posix()
+CONDITION_IMG = (MY_DIR.parents[1] / "diagrams" / "task_lifecycle" /
"condition.png").as_posix()
+SHARED_STATE_IMG = (MY_DIR.parents[1] / "diagrams" / "task_lifecycle" /
"shared_state.png").as_posix()
+TERMINAL_STATE_IMG = (MY_DIR.parents[1] / "diagrams" / "task_lifecycle" /
"terminal_state.png").as_posix()
+SENSOR_STATE_IMG = (MY_DIR.parents[1] / "diagrams" / "task_lifecycle" /
"sensor_state.png").as_posix()
+DEFERRABLE_STATE_IMG = (MY_DIR.parents[1] / "diagrams" / "task_lifecycle" /
"deferrable_state.png").as_posix()
+
+STATE_NODE_ATTRS = {"width": "4.16", "height": "1", "fontname": "Monospace",
"fontsize": "20"}
+COMPONENT_NODE_ATTRS = {
+ "width": "3.29",
+ "height": "1",
+ "fontname": "Sans-Serif",
+ "fontsize": "28",
+ "fontcolor": "#FFFFFF",
+}
+CONDITION_NODE_ATTRS = {
+ "width": "0.9",
+ "height": "0.9",
+ "labelloc": "t",
+ "fontname": "Sans-Serif",
+ "margin": "1.5,0.5",
+}
+START_NODE_ATTRS = {"width": "4", "height": "4", "fontname": "Sans-Serif",
"fontsize": "28"}
+LEGEND_NODE_ATTRS = {"fontname": "Sans-Serif", "fontsize": "24", "labelloc":
"t"}
+
+console = Console(width=400, color_system="standard")
+
+graph_attr = {
+ "concentrate": "false",
+ "splines": "true",
+}
+
+node_attr = {
+ "labelloc": "c",
+}
+
+edge_attr = {
+ "minlen": "2",
+ "penwidth": "4.0",
+ "labelloc": "t",
+ "fontsize": "14",
+ "fontname": "Sans-Serif",
+}
+
+
+def generate_task_lifecycle_diagram():
+ image_file = (MY_DIR / MY_FILENAME).with_suffix(".png")
+
+ console.print(f"[bright_blue]Generating task lifecycle image {image_file}")
+ with Diagram(
+ name="",
+ show=False,
+ direction="LR",
+ filename=MY_FILENAME,
+ outformat="png",
+ graph_attr=graph_attr,
+ edge_attr=edge_attr,
+ node_attr=node_attr,
+ ):
+ state_none = Custom("none", SHARED_STATE_IMG, **STATE_NODE_ATTRS)
+ state_removed = Custom("removed", TERMINAL_STATE_IMG,
**STATE_NODE_ATTRS)
+ state_upstream_failed = Custom("upstream_failed", TERMINAL_STATE_IMG,
**STATE_NODE_ATTRS)
+ state_skipped = Custom("skipped", TERMINAL_STATE_IMG,
**STATE_NODE_ATTRS)
+ state_scheduled = Custom("scheduled", SHARED_STATE_IMG,
**STATE_NODE_ATTRS)
+ state_queued = Custom("queued", SHARED_STATE_IMG, **STATE_NODE_ATTRS)
+ state_deferred = Custom("deferred", DEFERRABLE_STATE_IMG,
**STATE_NODE_ATTRS)
+ state_running = Custom("running", SHARED_STATE_IMG, **STATE_NODE_ATTRS)
+ state_up_for_reschedule = Custom("up_for_reschedule",
SENSOR_STATE_IMG, **STATE_NODE_ATTRS)
+ state_restarting = Custom("restarting", SHARED_STATE_IMG,
**STATE_NODE_ATTRS)
+ state_up_for_retry = Custom("up_for_retry", SHARED_STATE_IMG,
**STATE_NODE_ATTRS)
+ state_failed = Custom("failed", TERMINAL_STATE_IMG, **STATE_NODE_ATTRS)
+ state_success = Custom("success", TERMINAL_STATE_IMG,
**STATE_NODE_ATTRS)
+
+ component_scheduler = Custom("Scheduler", COMPONENT_IMG,
**COMPONENT_NODE_ATTRS)
+ component_executor = Custom("Executor", COMPONENT_IMG,
**COMPONENT_NODE_ATTRS)
+ component_triggerer = Custom("Triggerer", COMPONENT_IMG,
**COMPONENT_NODE_ATTRS)
+ component_worker = Custom("Worker", COMPONENT_IMG,
**COMPONENT_NODE_ATTRS)
+
+ start_node = StartEnd("Start", **START_NODE_ATTRS)
+
+ cond_upstream_task_failure = Custom(
+ "\n\n\n\n\nRequired upstream task(s) failed?", CONDITION_IMG,
**CONDITION_NODE_ATTRS
+ )
+ cond_scheduled_skip = Custom(
+ "\n\n\n\n\nTask should be skipped?", CONDITION_IMG,
**CONDITION_NODE_ATTRS
+ )
+ cond_task_def_existence = Custom(
+ "\n\n\n\n\nTask instance is still available?", CONDITION_IMG,
**CONDITION_NODE_ATTRS
+ )
+ cond_task_restore = Custom(
+ "\n\n\n\n\n Task instance got
restored?",
+ CONDITION_IMG,
+ **CONDITION_NODE_ATTRS,
+ )
+ cond_trigger_task_1 = Custom(
+ "\n\n\n\n\nA triggerer task?\n(can execute just only with
triggerer)",
+ CONDITION_IMG,
+ **CONDITION_NODE_ATTRS,
+ )
+ cond_trigger_task_2 = Custom("\n\n\n\n\nA triggerer task?",
CONDITION_IMG, **CONDITION_NODE_ATTRS)
+ cond_task_complete_1 = Custom(
+ "\n\n\n\n\n Task completes?", CONDITION_IMG,
**CONDITION_NODE_ATTRS
+ )
+ cond_task_complete_2 = Custom("\n\n\n\n\nTask completes?",
CONDITION_IMG, **CONDITION_NODE_ATTRS)
+ cond_defer_signal_raised = Custom(
+ "\n\n\n\n\nTask is deferrable,\nand defer signal is raised?",
+ CONDITION_IMG,
+ **CONDITION_NODE_ATTRS,
+ )
+ cond_skip_signal = Custom("\n\n\n\n\nSkip signal is raised?",
CONDITION_IMG, **CONDITION_NODE_ATTRS)
+ cond_sensor_reschedule = Custom(
+ "\n\n\n\n\nTask is a sensor in reschedule mode,\nand result is
undetermined?",
+ CONDITION_IMG,
+ **CONDITION_NODE_ATTRS,
+ )
+ cond_fail_mark = Custom(
+ "\n\n\n\n\nTask marked as failed?",
+ CONDITION_IMG,
+ **CONDITION_NODE_ATTRS,
+ )
+ cond_clear_mark = Custom("\n\n\n\n\nTask marked as cleared?",
CONDITION_IMG, **CONDITION_NODE_ATTRS)
+ cond_task_error = Custom("\n\n\n\n\nTask got error?", CONDITION_IMG,
**CONDITION_NODE_ATTRS)
+ cond_retriable = Custom("\n\n\n\n\nEligible for retry?",
CONDITION_IMG, **CONDITION_NODE_ATTRS)
+
+ start_node >> state_none >> component_scheduler >>
cond_upstream_task_failure
+ cond_upstream_task_failure >> Edge(label="NO") >> state_upstream_failed
+ cond_upstream_task_failure >> Edge(label="YES") >> cond_scheduled_skip
+ cond_scheduled_skip >> Edge(label="NO") >> cond_task_def_existence
+ (cond_scheduled_skip >> Edge(label="YES") >> state_skipped,)
+ (cond_task_def_existence >> Edge(label="NO") >> state_removed,)
+ (cond_task_def_existence >> Edge(label="YES") >> state_scheduled,)
+ state_removed >> cond_task_restore
+ cond_task_restore >> Edge(label="NO") >> state_removed
+ cond_task_restore >> Edge(label="YES") >> state_none
+ state_scheduled >> component_executor >> state_queued >>
cond_trigger_task_1
+ cond_trigger_task_1 >> Edge(label="NO") >> component_worker
+ cond_trigger_task_1 >> Edge(label="YES") >> component_triggerer
+ component_triggerer >> state_deferred >> cond_trigger_task_2
+ cond_trigger_task_2 >> Edge(label="NO") >> component_scheduler
+ cond_trigger_task_2 >> Edge(label="YES") >> cond_task_complete_1
+ cond_task_complete_1 >> Edge(label="NO") >> state_deferred
+ cond_task_complete_1 >> Edge(label="YES") >> cond_defer_signal_raised
+ component_worker >> state_running >> cond_defer_signal_raised
+ cond_defer_signal_raised >> Edge(label="NO") >> cond_skip_signal
+ cond_defer_signal_raised >> Edge(label="YES") >> component_triggerer
+ cond_skip_signal >> Edge(label="NO") >> cond_sensor_reschedule
+ cond_skip_signal >> Edge(label="YES") >> state_skipped
+ cond_sensor_reschedule >> Edge(label="NO") >> cond_fail_mark
+ cond_sensor_reschedule >> Edge(label="YES") >> state_up_for_reschedule
>> component_scheduler
+ cond_fail_mark >> Edge(label="NO") >> cond_clear_mark
+ cond_fail_mark >> Edge(label="YES") >> state_failed
+ cond_clear_mark >> Edge(label="NO") >> cond_task_error
+ cond_clear_mark >> Edge(label="YES") >> state_restarting >>
state_up_for_retry
+ cond_task_error >> Edge(label="NO") >> cond_task_complete_2
+ cond_task_error >> Edge(label="YES") >> cond_retriable
+ cond_retriable >> Edge(label="NO") >> state_failed
+ cond_retriable >> Edge(label="YES") >> state_up_for_retry
+ state_up_for_retry >> component_scheduler
+ cond_task_complete_2 >> Edge(label="NO") >> state_running
+ cond_task_complete_2 >> Edge(label="YES") >> state_success
+
+ with Cluster("", graph_attr={"margin": "40,40"}):
+ Custom("\n\nCondition", CONDITION_IMG, width="0.7", height="0.7",
**LEGEND_NODE_ATTRS)
+ Custom(
+ "\n\nState for Deferrable Tasks",
+ DEFERRABLE_STATE_IMG,
+ width="3.2",
+ height="0.77",
+ **LEGEND_NODE_ATTRS,
+ )
+ Custom(
+ "\n\nState for Sensors",
+ SENSOR_STATE_IMG,
+ width="3.2",
+ height="0.77",
+ **LEGEND_NODE_ATTRS,
+ )
+ Custom("\n\nTerminal State", TERMINAL_STATE_IMG, width="3.2",
height="0.77", **LEGEND_NODE_ATTRS)
+ Custom("\n\nShared State", SHARED_STATE_IMG, width="3.2",
height="0.77", **LEGEND_NODE_ATTRS)
+ Custom("\n\nComponent", COMPONENT_IMG, width="2.53",
height="0.77", **LEGEND_NODE_ATTRS)
+
+ console.print(f"[green]Generating architecture image {image_file}")
+
+
+if __name__ == "__main__":
+ generate_task_lifecycle_diagram()
diff --git a/docs/apache-airflow/img/task_lifecycle_diagram.png
b/docs/apache-airflow/img/task_lifecycle_diagram.png
deleted file mode 100644
index 2114d7f5d72..00000000000
Binary files a/docs/apache-airflow/img/task_lifecycle_diagram.png and /dev/null
differ
diff --git a/docs/diagrams/task_lifecycle/component.png
b/docs/diagrams/task_lifecycle/component.png
new file mode 100644
index 00000000000..e4a9f9321e1
Binary files /dev/null and b/docs/diagrams/task_lifecycle/component.png differ
diff --git a/docs/diagrams/task_lifecycle/condition.png
b/docs/diagrams/task_lifecycle/condition.png
new file mode 100644
index 00000000000..75e5977349d
Binary files /dev/null and b/docs/diagrams/task_lifecycle/condition.png differ
diff --git a/docs/diagrams/task_lifecycle/deferrable_state.png
b/docs/diagrams/task_lifecycle/deferrable_state.png
new file mode 100644
index 00000000000..c77dcf76014
Binary files /dev/null and b/docs/diagrams/task_lifecycle/deferrable_state.png
differ
diff --git a/docs/diagrams/task_lifecycle/sensor_state.png
b/docs/diagrams/task_lifecycle/sensor_state.png
new file mode 100644
index 00000000000..ed06cf74d39
Binary files /dev/null and b/docs/diagrams/task_lifecycle/sensor_state.png
differ
diff --git a/docs/diagrams/task_lifecycle/shared_state.png
b/docs/diagrams/task_lifecycle/shared_state.png
new file mode 100644
index 00000000000..950d7296601
Binary files /dev/null and b/docs/diagrams/task_lifecycle/shared_state.png
differ
diff --git a/docs/diagrams/task_lifecycle/terminal_state.png
b/docs/diagrams/task_lifecycle/terminal_state.png
new file mode 100644
index 00000000000..4a6a5dbe962
Binary files /dev/null and b/docs/diagrams/task_lifecycle/terminal_state.png
differ