Arunodoy18 opened a new pull request, #60761:
URL: https://github.com/apache/airflow/pull/60761

   Closes #60559
   
   Summary
   Airflow currently allows multiple DAG files to define the same dag_id and 
keeps only the last one parsed, silently overwriting the earlier definition. 
Because DAG parse order is nondeterministic (especially in distributed 
environments such as Composer/GCS), the DAG that wins can change from one 
scheduler run to the next.
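
   The overwrite described above can be reproduced with a toy model. This is 
not Airflow code: `parse_all`, the file paths, and the `daily_etl` dag_id are 
hypothetical, and a plain dict stands in for the collection of parsed DAGs.

```python
# Toy model, not Airflow code: a plain dict keyed by dag_id stands in for
# the collection of parsed DAGs. All names and paths here are hypothetical.
def parse_all(files):
    """Return {dag_id: fileloc}, letting later files overwrite earlier ones."""
    dags = {}
    for fileloc, dag_id in files:
        dags[dag_id] = fileloc  # silent overwrite when dag_id already exists
    return dags


files = [
    ("dags/team_a/etl.py", "daily_etl"),
    ("dags/team_b/etl.py", "daily_etl"),
]

print(parse_all(files))            # team_b's file wins
print(parse_all(reversed(files)))  # same files, other order: team_a's file wins
```

   The same two files yield a different surviving DAG depending only on the 
order in which they are parsed.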
   
   This PR introduces a minimal, backward-compatible safeguard: it detects 
dag_id collisions during DagBag parsing and emits a clear warning in the 
scheduler logs.
   
   What changed
   - Detect when a DAG with an already existing dag_id is being added to the 
DagBag.
   - Log a warning showing:
     • the dag_id  
     • original DAG file location  
     • new DAG file location
   - Preserve existing behavior (no parsing failure or UI change).
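
   A minimal sketch of the check listed above, under stated assumptions. It 
does not reproduce this PR's diff: `FakeDag`, `FakeDagBag`, the logger name, 
and the message wording are hypothetical stand-ins, and comparing `fileloc` to 
skip re-parses of the same file is an assumption of the sketch.

```python
import logging
from dataclasses import dataclass, field

log = logging.getLogger("dagbag_sketch")  # hypothetical logger name


@dataclass
class FakeDag:
    """Stand-in for a DAG: only the two attributes the check needs."""
    dag_id: str
    fileloc: str


@dataclass
class FakeDagBag:
    """Stand-in for a DagBag: a dict of DAGs keyed by dag_id."""
    dags: dict = field(default_factory=dict)

    def bag_dag(self, dag):
        existing = self.dags.get(dag.dag_id)
        # Assumption: re-parsing the same file is not a collision, so only
        # warn when the dag_id arrives from a different file location.
        if existing is not None and existing.fileloc != dag.fileloc:
            log.warning(
                "Duplicate dag_id %r: defined in %s, now overridden by %s",
                dag.dag_id,
                existing.fileloc,
                dag.fileloc,
            )
        # Existing behavior is preserved: the last parsed DAG still wins.
        self.dags[dag.dag_id] = dag


bag = FakeDagBag()
bag.bag_dag(FakeDag("daily_etl", "dags/team_a/etl.py"))
bag.bag_dag(FakeDag("daily_etl", "dags/team_b/etl.py"))  # logs one warning
```

   The warning carries the three items from the list (dag_id, original 
location, new location), and the assignment after it keeps last-parsed-wins 
semantics unchanged.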
   
   Why this is safe
   This mirrors Airflow’s existing duplicate task_id handling pattern (warn 
instead of fail) and improves visibility without breaking backward 
compatibility.
   
   Scope
   Only DagBag parsing logic is updated. No changes to scheduling, 
serialization, or UI.
   
   Behavior after fix
   Users will now see an explicit scheduler warning whenever duplicate dag_id 
definitions are encountered. The last parsed DAG still wins, but the override 
is no longer silent.
   

