This is an automated email from the ASF dual-hosted git repository.

rahulvats pushed a commit to branch v3-2-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v3-2-test by this push:
     new abdd222cd1a [v3-2-test] [DOCS]add guide for dag version inflation and it's checker (#64100) (#64461)
abdd222cd1a is described below

commit abdd222cd1aab712385707d3a7ea1cc8833a8b75
Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
AuthorDate: Mon Mar 30 14:55:34 2026 +0530

    [v3-2-test] [DOCS]add guide for dag version inflation and it's checker (#64100) (#64461)
    
    * add docs for dag version inflation
    
    * fix docs
    
    * fix docs
    (cherry picked from commit 15a1ef8cc4873b2e12e89616ec49d303176bcf76)
    
    Co-authored-by: Jeongwoo Do <[email protected]>
---
 airflow-core/docs/best-practices.rst |   2 +
 airflow-core/docs/faq.rst            | 163 +++++++++++++++++++++++++++++++++++
 2 files changed, 165 insertions(+)

diff --git a/airflow-core/docs/best-practices.rst b/airflow-core/docs/best-practices.rst
index d58d5c48fe0..cd0f102d7ef 100644
--- a/airflow-core/docs/best-practices.rst
+++ b/airflow-core/docs/best-practices.rst
@@ -296,6 +296,8 @@ When you execute that code you will see:
 
 This means that the ``get_array`` is not executed as top-level code, but ``get_task_id`` is.
 
+.. _best_practices/code_quality_and_linting:
+
 Code Quality and Linting
 ------------------------
 
diff --git a/airflow-core/docs/faq.rst b/airflow-core/docs/faq.rst
index f57f2ddebf2..fe48c3695dc 100644
--- a/airflow-core/docs/faq.rst
+++ b/airflow-core/docs/faq.rst
@@ -237,6 +237,169 @@ There are several reasons why Dags might disappear from the UI. Common causes in
 * **Time synchronization issues** - Ensure all nodes (database, schedulers, workers) use NTP with <1s clock drift.
 
 
+.. _faq:dag-version-inflation:
+
+Why does my Dag version keep increasing?
+-----------------------------------------
+
+Every time the Dag processor parses a Dag file, it serializes the Dag and compares the result with the
+version stored in the metadata database. If anything has changed, Airflow creates a new Dag version.
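+
+The mechanism can be sketched in plain Python (a simplified illustration; ``serialize`` and
+``count_versions`` are hypothetical stand-ins, not Airflow internals):
+
+.. code-block:: python
+
+    import hashlib
+    import json
+    from datetime import datetime, timedelta
+
+
+    def serialize(dag_args):
+        # Stand-in for Dag serialization: a stable hash of the Dag's arguments
+        payload = json.dumps(dag_args, default=str, sort_keys=True)
+        return hashlib.sha256(payload.encode()).hexdigest()
+
+
+    def count_versions(start_dates):
+        versions = []
+        for start_date in start_dates:  # one iteration per parse cycle
+            serialized = serialize({"dag_id": "example", "start_date": start_date})
+            if not versions or versions[-1] != serialized:
+                versions.append(serialized)  # changed since last parse -> new Dag version
+        return len(versions)
+
+
+    # A static start_date serializes identically on every parse: one version
+    print(count_versions([datetime(2024, 1, 1)] * 3))  # 1
+
+    # A parse-time value differs on every parse: a new version each time
+    base = datetime(2026, 3, 30)
+    print(count_versions([base + timedelta(seconds=i) for i in range(3)]))  # 3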
+
+**Dag version inflation** occurs when the version number increases indefinitely without the Dag author
+making any intentional changes.
+
+What goes wrong
+"""""""""""""""
+
+When Dag versions increase without meaningful changes:
+
+* The metadata database accumulates unnecessary Dag version records, increasing storage and query overhead.
+* The UI shows a misleading history of Dag changes, making it harder to identify real modifications.
+* The scheduler and API server may consume more memory as they load and cache a growing number of Dag versions.
+
+Common causes
+"""""""""""""
+
+Version inflation is caused by using values that change at **parse time** — that is, every time the Dag
+processor evaluates the Dag file — as arguments to Dag or Task constructors. The most common patterns are:
+
+**1. Using ``datetime.now()`` or ``pendulum.now()`` as ``start_date``:**
+
+.. code-block:: python
+
+    from datetime import datetime
+
+    from airflow.sdk import DAG
+
+    with DAG(
+        dag_id="bad_example",
+        # BAD: datetime.now() produces a different value on every parse
+        start_date=datetime.now(),
+        schedule="@daily",
+    ):
+        ...
+
+Every parse produces a different ``start_date``, so the serialized Dag is always different from the
+stored version.
+
+**2. Using random values in Dag or Task arguments:**
+
+.. code-block:: python
+
+    import random
+    from datetime import datetime
+
+    from airflow.sdk import DAG
+    from airflow.providers.standard.operators.python import PythonOperator
+
+    with DAG(dag_id="bad_random", start_date=datetime(2024, 1, 1), schedule="@daily") as dag:
+        PythonOperator(
+            # BAD: random value changes every parse
+            task_id=f"task_{random.randint(1, 1000)}",
+            python_callable=lambda: None,
+        )
+
+**3. Assigning runtime-varying values to variables used in constructors:**
+
+.. code-block:: python
+
+    from datetime import datetime
+
+    from airflow.sdk import DAG
+    from airflow.providers.standard.operators.python import PythonOperator
+
+    # BAD: the variable captures a parse-time value, then is passed to the DAG
+    default_args = {"start_date": datetime.now()}
+
+    with DAG(dag_id="bad_defaults", default_args=default_args, schedule="@daily") as dag:
+        PythonOperator(task_id="my_task", python_callable=lambda: None)
+
+Even though ``datetime.now()`` is not called directly inside the Dag constructor, it flows in through
+``default_args`` and still causes a different serialized Dag on every parse.
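+
+The fix is the same as for a direct call: freeze the value before it flows into ``default_args``
+(a minimal sketch):
+
+.. code-block:: python
+
+    from datetime import datetime
+
+    # GOOD: a static value, identical on every parse
+    default_args = {"start_date": datetime(2024, 1, 1)}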
+
+**4. Using environment variables or file contents that change between parses:**
+
+.. code-block:: python
+
+    import os
+    from datetime import datetime
+
+    from airflow.sdk import DAG
+    from airflow.providers.standard.operators.bash import BashOperator
+
+    with DAG(dag_id="bad_env", start_date=datetime(2024, 1, 1), schedule="@daily") as dag:
+        BashOperator(
+            task_id="echo_build",
+            # BAD if BUILD_NUMBER changes on every deployment or parse
+            bash_command=f"echo {os.environ.get('BUILD_NUMBER', 'unknown')}",
+        )
+
+How to avoid version inflation
+""""""""""""""""""""""""""""""
+
+* **Use fixed ``start_date`` values.** Always set ``start_date`` to a static ``datetime`` literal:
+
+  .. code-block:: python
+
+      import datetime
+
+      from airflow.sdk import DAG
+
+      with DAG(
+          dag_id="good_example",
+          start_date=datetime.datetime(2024, 1, 1),
+          schedule="@daily",
+      ):
+          ...
+
+* **Keep all Dag and Task constructor arguments deterministic.** Arguments passed to Dag and Operator
+  constructors must produce the same value on every parse. Move any dynamic computation into the
+  ``execute()`` method or use Jinja templates, which are evaluated at task execution time rather than
+  parse time.
+
+* **Use Jinja templates for dynamic values:**
+
+  .. code-block:: python
+
+      from airflow.providers.standard.operators.bash import BashOperator
+
+      BashOperator(
+          task_id="echo_date",
+          # GOOD: the template is resolved at execution time, not parse time
+          bash_command="echo {{ ds }}",
+      )
+
+* **Use Airflow Variables with templates instead of top-level lookups:**
+
+  .. code-block:: python
+
+      from airflow.providers.standard.operators.bash import BashOperator
+
+      BashOperator(
+          task_id="echo_var",
+          # GOOD: Variable is resolved at execution time via template
+          bash_command="echo {{ var.value.my_variable }}",
+      )
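+
+The rule of thumb behind all of these fixes, stripped of Airflow specifics: a value computed at module
+top level is baked in at parse time, while a callable only runs when the task executes
+(``current_time`` below is a hypothetical helper for illustration):
+
+.. code-block:: python
+
+    from datetime import datetime
+
+    # Parse time: evaluated once per parse, so the result is baked into the
+    # serialized Dag and changes on every parse
+    parse_time_value = datetime.now()
+
+
+    def current_time():
+        # Execution time: the function is serialized by reference; its body
+        # only runs when the task executes, so the serialized Dag stays stable
+        return datetime.now()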
+
+Dag version inflation detection
+""""""""""""""""""""""""""""""""
+
+Starting from Airflow 3.2, the Dag processor performs **AST-based static analysis** on every Dag file
+before parsing to detect runtime-varying values in Dag and Task constructors. When a potential issue is
+found, it is surfaced as a **Dag warning** visible in the UI.
+
+You can control this behavior with the
+:ref:`dag_version_inflation_check_level <config:dag_processor__dag_version_inflation_check_level>`
+configuration option:
+
+* ``off`` — Disables the check entirely. No errors or warnings are generated.
+* ``warning`` (default) — Dags load normally but warnings are displayed in the UI when issues are detected.
+* ``error`` — Treats detected issues as Dag import errors, preventing the Dag from loading.
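+
+For example, to fail imports outright in a CI environment, set the option in ``airflow.cfg`` (assuming
+the option lives in the ``[dag_processor]`` section, as the option name above suggests):
+
+.. code-block:: ini
+
+    [dag_processor]
+    dag_version_inflation_check_level = error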
+
+Additionally, you can catch these issues earlier in your development workflow by using the
+`AIR302 <https://docs.astral.sh/ruff/rules/airflow3-dag-dynamic-value/>`_ ruff rule, which detects
+dynamic values in Dag and Task constructors as part of static linting. See
+:ref:`best_practices/code_quality_and_linting` for how to set up ruff with Airflow-specific rules.
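+
+For example, ruff can be configured in ``pyproject.toml`` to enable the Airflow rule group (a sketch;
+check the ruff documentation for the exact rule codes your ruff version supports):
+
+.. code-block:: toml
+
+    [tool.ruff.lint]
+    extend-select = ["AIR"]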
+
+
 Dag construction
 ^^^^^^^^^^^^^^^^
 
