This is an automated email from the ASF dual-hosted git repository.
rahulvats pushed a commit to branch v3-2-test
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/v3-2-test by this push:
new abdd222cd1a [v3-2-test] [DOCS]add guide for dag version inflation and
it's checker (#64100) (#64461)
abdd222cd1a is described below
commit abdd222cd1aab712385707d3a7ea1cc8833a8b75
Author: github-actions[bot]
<41898282+github-actions[bot]@users.noreply.github.com>
AuthorDate: Mon Mar 30 14:55:34 2026 +0530
[v3-2-test] [DOCS]add guide for dag version inflation and it's checker
(#64100) (#64461)
* add docs for dag version inflation
* fix docs
* fix docs
(cherry picked from commit 15a1ef8cc4873b2e12e89616ec49d303176bcf76)
Co-authored-by: Jeongwoo Do <[email protected]>
---
airflow-core/docs/best-practices.rst | 2 +
airflow-core/docs/faq.rst | 163 +++++++++++++++++++++++++++++++++++
2 files changed, 165 insertions(+)
diff --git a/airflow-core/docs/best-practices.rst
b/airflow-core/docs/best-practices.rst
index d58d5c48fe0..cd0f102d7ef 100644
--- a/airflow-core/docs/best-practices.rst
+++ b/airflow-core/docs/best-practices.rst
@@ -296,6 +296,8 @@ When you execute that code you will see:
This means that the ``get_array`` is not executed as top-level code, but
``get_task_id`` is.
+.. _best_practices/code_quality_and_linting:
+
Code Quality and Linting
------------------------
diff --git a/airflow-core/docs/faq.rst b/airflow-core/docs/faq.rst
index f57f2ddebf2..fe48c3695dc 100644
--- a/airflow-core/docs/faq.rst
+++ b/airflow-core/docs/faq.rst
@@ -237,6 +237,169 @@ There are several reasons why Dags might disappear from
the UI. Common causes in
* **Time synchronization issues** - Ensure all nodes (database, schedulers,
workers) use NTP with <1s clock drift.
+.. _faq:dag-version-inflation:
+
+Why does my Dag version keep increasing?
+-----------------------------------------
+
+Every time the Dag processor parses a Dag file, it serializes the Dag and
compares the result with the
+version stored in the metadata database. If anything has changed, Airflow
creates a new Dag version.
+
+**Dag version inflation** occurs when the version number increases
indefinitely without the Dag author
+making any intentional changes.
+
+What goes wrong
+"""""""""""""""
+
+When Dag versions increase without meaningful changes:
+
+* The metadata database accumulates unnecessary Dag version records,
increasing storage and query overhead.
+* The UI shows a misleading history of Dag changes, making it harder to
identify real modifications.
+* The scheduler and API server may consume more memory as they load and cache
a growing number of Dag versions.
+
+Common causes
+"""""""""""""
+
+Version inflation is caused by using values that change at **parse time** —
that is, every time the Dag
+processor evaluates the Dag file — as arguments to Dag or Task constructors.
The most common patterns are:
+
+**1. Using ``datetime.now()`` or ``pendulum.now()`` as ``start_date``:**
+
+.. code-block:: python
+
+ from datetime import datetime
+
+ from airflow.sdk import DAG
+
+ with DAG(
+ dag_id="bad_example",
+ # BAD: datetime.now() produces a different value on every parse
+ start_date=datetime.now(),
+ schedule="@daily",
+ ):
+ ...
+
+Every parse produces a different ``start_date``, so the serialized Dag is
always different from the
+stored version.
+
+**2. Using random values in Dag or Task arguments:**
+
+.. code-block:: python
+
+ import random
+
+ from airflow.sdk import DAG
+ from airflow.providers.standard.operators.python import PythonOperator
+
+ with DAG(dag_id="bad_random", start_date="2024-01-01", schedule="@daily")
as dag:
+ PythonOperator(
+ # BAD: random value changes every parse
+ task_id=f"task_{random.randint(1, 1000)}",
+ python_callable=lambda: None,
+ )
+
+**3. Assigning runtime-varying values to variables used in constructors:**
+
+.. code-block:: python
+
+ from datetime import datetime
+
+ from airflow.sdk import DAG
+ from airflow.providers.standard.operators.python import PythonOperator
+
+ # BAD: the variable captures a parse-time value, then is passed to the DAG
+ default_args = {"start_date": datetime.now()}
+
+ with DAG(dag_id="bad_defaults", default_args=default_args,
schedule="@daily") as dag:
+ PythonOperator(task_id="my_task", python_callable=lambda: None)
+
+Even though ``datetime.now()`` is not called directly inside the Dag
constructor, it flows in through
+``default_args`` and still causes a different serialized Dag on every parse.
+
+**4. Using environment variables or file contents that change between parses:**
+
+.. code-block:: python
+
+ import os
+
+ from airflow.sdk import DAG
+ from airflow.providers.standard.operators.bash import BashOperator
+
+ with DAG(dag_id="bad_env", start_date="2024-01-01", schedule="@daily") as
dag:
+ BashOperator(
+ task_id="echo_build",
+ # BAD if BUILD_NUMBER changes on every deployment or parse
+ bash_command=f"echo {os.environ.get('BUILD_NUMBER', 'unknown')}",
+ )
+
+How to avoid version inflation
+""""""""""""""""""""""""""""""
+
+* **Use fixed ``start_date`` values.** Always set ``start_date`` to a static
``datetime`` literal:
+
+ .. code-block:: python
+
+ import datetime
+
+ from airflow.sdk import DAG
+
+ with DAG(
+ dag_id="good_example",
+ start_date=datetime.datetime(2024, 1, 1),
+ schedule="@daily",
+ ):
+ ...
+
+* **Keep all Dag and Task constructor arguments deterministic.** Arguments
passed to Dag and Operator
+ constructors must produce the same value on every parse. Move any dynamic
computation into the
+ ``execute()`` method or use Jinja templates, which are evaluated at task
execution time rather than
+ parse time.
+
+* **Use Jinja templates for dynamic values:**
+
+ .. code-block:: python
+
+ from airflow.providers.standard.operators.bash import BashOperator
+
+ BashOperator(
+ task_id="echo_date",
+ # GOOD: the template is resolved at execution time, not parse time
+ bash_command="echo {{ ds }}",
+ )
+
+* **Use Airflow Variables with templates instead of top-level lookups:**
+
+ .. code-block:: python
+
+ from airflow.providers.standard.operators.bash import BashOperator
+
+ BashOperator(
+ task_id="echo_var",
+ # GOOD: Variable is resolved at execution time via template
+ bash_command="echo {{ var.value.my_variable }}",
+ )
+
+Dag version inflation detection
+""""""""""""""""""""""""""""""""
+
+Starting from Airflow 3.2, the Dag processor performs **AST-based static
analysis** on every Dag file
+before parsing to detect runtime-varying values in Dag and Task constructors.
When a potential issue is
+found, it is surfaced as a **Dag warning** visible in the UI.
+
+You can control this behavior with the
+:ref:`dag_version_inflation_check_level
<config:dag_processor__dag_version_inflation_check_level>`
+configuration option:
+
+* ``off`` — Disables the check entirely. No errors or warnings are generated.
+* ``warning`` (default) — Dags load normally but warnings are displayed in the
UI when issues are detected.
+* ``error`` — Treats detected issues as Dag import errors, preventing the Dag
from loading.
+
+Additionally, you can catch these issues earlier in your development workflow
by using the
+`AIR302 <https://docs.astral.sh/ruff/rules/airflow3-dag-dynamic-value/>`_ ruff
rule, which detects
+dynamic values in Dag and Task constructors as part of static linting. See
+:ref:`best_practices/code_quality_and_linting` for how to set up ruff with
Airflow-specific rules.
+
+
Dag construction
^^^^^^^^^^^^^^^^