wolvery opened a new issue, #56471:
URL: https://github.com/apache/airflow/issues/56471

   ### Apache Airflow version
   
   3.1.0
   
   ### If "Other Airflow 2 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   We are experiencing an issue where the dag processor creates a new version 
of a DAG in the serialized_dag table on nearly every parsing cycle, even when 
the underlying DAG file is functionally unchanged. 
   
   The root cause appears to be non-deterministic serialization. The 
order of dictionary keys and list elements in the JSON column of the 
serialized_dag table is inconsistent between parses. As a result, new 
versions are created for logically identical DAGs in which only the order of 
keys inside the JSON has changed.
   
   
   Example:
   ```
   --- version_1
   +++ version_2
   @@ -1,13 +1,13 @@
   -"dag_id": "test",
   -"max_consecutive_failed_dag_runs": 7,
   -"timetable": { ... },
   -"relative_fileloc": "revision_dags/test.py",
   -"task_group": { ... },
   -"fileloc": "/opt/airflow/dags/revision_dags/test.py",
   -"timezone": "UTC",
   -"default_args": { ... },
   -"description": "DAG for [domain='test', data_product='polaroid_input_features', pipeline='main']",
   -"max_active_runs": 1,
   -"tags": [ ... ],
   -"start_date": 1640995200.0
   +"max_consecutive_failed_dag_runs": 7,
   +"task_group": { ... },
   +"timezone": "UTC",
   +"max_active_runs": 1,
   +"fileloc": "/opt/airflow/dags/revision_dags/test.py",
   +"timetable": { ... },
   +"start_date": 1640995200.0,
   +"description": "DAG for [domain='test', data_product='polaroid_input_features', pipeline='main']",
   +"default_args": { ... },
   +"tags": [ ... ],
   +"relative_fileloc": "revision_dags/test.py",
   +"dag_id": "test"
   ```
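
To illustrate why a pure reordering like the one in the diff can register as a change, here is a minimal standalone sketch. It assumes that a change is detected by comparing a digest of the serialized JSON text; `naive_digest` and the sample values are hypothetical and are not taken from Airflow's code.

```python
import hashlib
import json

# Two dicts with identical content but different key insertion order,
# mimicking two parses of the same DAG file (values are illustrative).
parse_1 = {"dag_id": "test", "max_active_runs": 1, "timezone": "UTC"}
parse_2 = {"timezone": "UTC", "max_active_runs": 1, "dag_id": "test"}


def naive_digest(data: dict) -> str:
    """Hash the JSON text as emitted, so key order leaks into the digest."""
    return hashlib.sha256(json.dumps(data).encode("utf-8")).hexdigest()


# Dict equality ignores insertion order, but the serialized text does not.
same_content = parse_1 == parse_2
same_digest = naive_digest(parse_1) == naive_digest(parse_2)
print(same_content, same_digest)  # True False
```

The two parses compare equal as Python dicts, yet their digests differ because `json.dumps` emits keys in insertion order unless told otherwise.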
   
   ### What you think should happen instead?
   
   The dag processor should compare serialized DAGs in an order-independent 
way, for example by sorting the keys before the comparison, so that a mere 
reordering does not create a new version.
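
One way the order-independent comparison could look is sketched below. This is an assumption-laden illustration, not Airflow's implementation: `canonicalize`, `canonical_digest`, and `UNORDERED_LIST_KEYS` are hypothetical names, and treating `tags` as an order-insensitive list is a guess made only for the example.

```python
import hashlib
import json
from typing import Any, Optional

# Hypothetical: names of fields whose list order carries no meaning.
UNORDERED_LIST_KEYS = frozenset({"tags"})


def canonicalize(value: Any, parent_key: Optional[str] = None) -> Any:
    """Return a copy with dict keys sorted and selected lists sorted."""
    if isinstance(value, dict):
        return {k: canonicalize(value[k], k) for k in sorted(value)}
    if isinstance(value, list):
        items = [canonicalize(v) for v in value]
        if parent_key in UNORDERED_LIST_KEYS:
            # Sort by JSON text so nested or mixed-type items stay comparable.
            items.sort(key=lambda item: json.dumps(item, sort_keys=True))
        return items
    return value


def canonical_digest(data: dict) -> str:
    """Digest that is stable under key reordering of the input."""
    text = json.dumps(canonicalize(data), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


a = {"dag_id": "test", "tags": ["x", "y"], "default_args": {"retries": 1, "owner": "o"}}
b = {"default_args": {"owner": "o", "retries": 1}, "tags": ["y", "x"], "dag_id": "test"}
print(canonical_digest(a) == canonical_digest(b))
```

Lists are left in their original order unless explicitly listed as order-insensitive, since list order can be meaningful elsewhere in a serialized DAG.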
   
   ### How to reproduce
   
   Creating a simple dynamically generated DAG and importing it into the 
module's global namespace seems to trigger the problem.
   
   ### Operating System
   
   airflow:3.1.0 python 3.10 image
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   Official Apache Airflow Helm Chart
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

