ben-astro opened a new issue #17861:
URL: https://github.com/apache/airflow/issues/17861


   **Description**
   
   I would like to see an error in Airflow UI (like ``` "Broken DAG: duplicate 
DAG-id" ```) when a DAG is present that has the same (non-unique) DAG-id as 
another DAG. Currently, Airflow silently ignores one of these duplicated DAGs. 
   
   **Use case / motivation**
   For users that either knowingly or unknowingly have multiple DAGs with the same DAG id (which can easily happen when generating DAGs dynamically), unintended or surprising behavior can result as Airflow picks one of the duplicates. It is not clear how Airflow makes that choice, or whether the choice can change each time it re-parses the DAG files.
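   A minimal plain-Python sketch (no Airflow dependency) of how dynamic generation can silently collide: keying generated DAGs by `dag_id` in a dict means a duplicate id overwrites the earlier entry with no error, mirroring the silent override described above. The config fields and naming scheme here are hypothetical.

   ```python
   def build_dags(configs):
       """Hypothetical dynamic DAG factory: one 'DAG' per config entry."""
       dags = {}
       for cfg in configs:
           dag_id = f"etl_{cfg['team']}"  # illustrative naming scheme
           dags[dag_id] = cfg             # a duplicate id silently overwrites
       return dags

   configs = [
       {"team": "sales", "source": "db_a"},
       {"team": "sales", "source": "db_b"},  # same team -> same dag_id
   ]
   dags = build_dags(configs)
   # Only one "etl_sales" survives, and which config it carries depends on
   # iteration order -- the user gets no warning that the other was dropped.
   ```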
   
   **Additional Info**
   Currently in Airflow, my understanding is that the DAG file processor creates a new process and a new DagBag per file. As a result, each DagBag parses only one DAG script, so DAGs with duplicate ids in different scripts are never detected.
https://github.com/apache/airflow/blob/main/airflow/dag_processing/processor.py#L613
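   A plain-Python sketch of why per-file parsing hides the clash: each file gets a fresh "bag" (here just a dict), so an intra-file duplicate could be caught, but a cross-file duplicate never exists inside any single bag. This mirrors, rather than reproduces, the processor behavior linked above.

   ```python
   def parse_file(dag_ids_in_file):
       """Illustrative stand-in for parsing one DAG file into a fresh bag."""
       bag = {}                      # a brand-new "DagBag" per file
       for dag_id in dag_ids_in_file:
           if dag_id in bag:
               # only duplicates *within* this one file can be seen here
               raise ValueError(f"duplicate DAG id {dag_id!r}")
           bag[dag_id] = object()
       return bag

   bag_a = parse_file(["etl"])       # dags/a.py declares "etl"
   bag_b = parse_file(["etl"])       # dags/b.py also declares "etl"
   # No error is raised: the duplicate exists across bags, never within one.
   ```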
   
   Additionally, there is already a test and an exception (`AirflowDagDuplicatedIdException`) in place for this, but the check relies on a single DagBag containing multiple DAGs, which is not the case when actually running Airflow, per discussion with @BasPH and @uranusjr.
   
https://github.com/apache/airflow/blob/2.1.3/airflow/models/dagbag.py#L405-L409
   
https://github.com/apache/airflow/blob/main/tests/models/test_dagbag.py#L143-L173
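   A simplified sketch of the kind of duplicate check the linked `dagbag.py` lines perform within a single bag (the real code raises `AirflowDagDuplicatedIdException`; the stand-in exception and `MiniDagBag` class here are illustrative only, assuming the behavior described above):

   ```python
   class DagDuplicatedIdError(Exception):
       """Stand-in for Airflow's AirflowDagDuplicatedIdException."""

   class MiniDagBag:
       def __init__(self):
           self.dags = {}            # dag_id -> file location

       def add_dag(self, dag_id, fileloc):
           existing = self.dags.get(dag_id)
           if existing is not None and existing != fileloc:
               # same id from a *different* file inside the same bag -> error
               raise DagDuplicatedIdError(
                   f"DAG id {dag_id!r} from {fileloc} "
                   f"already loaded from {existing}"
               )
           self.dags[dag_id] = fileloc

   bag = MiniDagBag()
   bag.add_dag("etl", "dags/a.py")
   try:
       bag.add_dag("etl", "dags/b.py")   # only detected if both share one bag
   except DagDuplicatedIdError as exc:
       print(exc)
   ```

   Because the real scheduler builds a separate DagBag per file, this branch is effectively unreachable across files, which is the gap this issue asks to close.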
   
   **Related Issues**
   
   I did not find other issues related to this when searching for "duplicate DAG" or "duplicate DAG-id".

