kaxil commented on issue #56471:
URL: https://github.com/apache/airflow/issues/56471#issuecomment-3439016976

   As @ephraimbuddy pointed out, the codebase already correctly handles 
non-deterministic key ordering in DAG serialization.
   
   
   1. **Key ordering IS non-deterministic across Python processes**
      - When serializing the same DAG in different Python processes, dictionary 
keys appear in different orders
      - This is expected behavior due to Python's hash randomization affecting 
`frozenset` iteration order
   
   2. **BUT this does NOT cause excessive DAG versions**
      - The system uses a **hash-based comparison** to detect changes
      - The `SerializedDagModel.hash()` method sorts all dictionaries before 
hashing
      - Therefore, hashes are deterministic despite varying key orders
      - No new versions are created when the DAG content is unchanged
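
   To illustrate why sorting before hashing makes the result independent of key order, here is a minimal sketch. It is not the Airflow implementation: `stable_hash` is a hypothetical helper, and the choice of SHA-256 and `json.dumps(sort_keys=True)` is an assumption made for the example.

```python
import hashlib
import json


def stable_hash(data: dict) -> str:
    """Hash a JSON-serializable dict independently of key insertion order.

    Hypothetical helper for illustration only, not Airflow's code.
    """
    # sort_keys=True orders dictionary keys at every nesting level
    # before the data is encoded and hashed.
    canonical = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Same content with different key order, as two processes might produce.
a = {"dag_id": "example", "tasks": [{"task_id": "t1", "retries": 2}]}
b = {"tasks": [{"retries": 2, "task_id": "t1"}], "dag_id": "example"}

raw_differs = json.dumps(a) != json.dumps(b)    # raw serialization differs
hash_matches = stable_hash(a) == stable_hash(b)  # canonical hash matches
```

   Note that `sort_keys=True` only reorders dictionary keys; the order of list elements still affects the hash in this sketch.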
   
   The relevant code in `airflow-core/src/airflow/models/serialized_dag.py`:
   
   **Hash computation:**
   
   
https://github.com/apache/airflow/blob/5013aad00b3b76e442861f8233c2691845f1fff1/airflow-core/src/airflow/models/serialized_dag.py#L341-L352
   
   
   **Version comparison:**
   
   
https://github.com/apache/airflow/blob/5013aad00b3b76e442861f8233c2691845f1fff1/airflow-core/src/airflow/models/serialized_dag.py#L419-L425
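
   The comparison step can be sketched the same way: a new version is recorded only when the content hash differs from the latest stored one. The in-memory `versions` store and the `write_if_changed` and `content_hash` names below are hypothetical stand-ins, not the code behind the links above.

```python
import hashlib
import json

# Hypothetical in-memory stand-in for stored versions: dag_id -> list of hashes.
versions: dict[str, list[str]] = {}


def content_hash(data: dict) -> str:
    """Order-independent hash of a JSON-serializable dict (illustrative)."""
    canonical = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def write_if_changed(dag_id: str, data: dict) -> bool:
    """Record a new version only if the hash differs from the latest one."""
    new_hash = content_hash(data)
    history = versions.setdefault(dag_id, [])
    if history and history[-1] == new_hash:
        return False  # unchanged content: no new version
    history.append(new_hash)
    return True


first = write_if_changed("example", {"dag_id": "example", "schedule": "@daily"})
# Same content, different key order: should not create a version.
second = write_if_changed("example", {"schedule": "@daily", "dag_id": "example"})
# Real content change: should create a version.
third = write_if_changed("example", {"dag_id": "example", "schedule": "@hourly"})
```

   Under these assumptions, only the first and third calls add a version, so reordered keys alone never grow the history.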
   
   If you have a reproducible DAG or script that shows extra versions being 
created, feel free to re-open this issue and provide it. Thanks.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
