joelossher opened a new issue, #24286:
URL: https://github.com/apache/airflow/issues/24286

   ### Apache Airflow version
   
   2.2.5
   
   ### What happened
   
   We deleted the file for an old DAG and since then our Airflow instance has 
been reporting an import error to Datadog via the 
`airflow.dag_processing.import_errors` gauge metric.
   
   Looking at the DAG processor manager logs, we see the following:
   ```
   /usr/local/airflow/dags/etl_facebook_comments.py                                              1           0  12.29s          2022-06-06T20:40:22
   ...
   [2022-06-06 20:42:36,488] {manager.py:548} INFO - DAG etl_facebook_comments is missing and will be deactivated.
   [2022-06-06 20:42:36,492] {manager.py:558} INFO - Deactivated 1 DAGs which are no longer present in file.
   ...
   /usr/local/airflow/dags/etl_facebook_comments.py                                              0           1  11.04s          2022-06-06T20:42:29
   ```
   
   After a short while `etl_facebook_comments.py` stops appearing in `dag_processor_manager.log`, but the metric stays stuck at 1.
   
   As expected, no errors show up in the UI and `select * from import_error` 
returns nothing.
   
   ### What you think should happen instead
   
   The gauge metric should not get stuck reporting an import error for a DAG 
that no longer exists.
   
   From a brief look through the code, I suspect there's a bug that prevents the deleted file's entry from being removed from `_file_stats` in the `DagFileProcessorManager`.
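
   If I'm reading `manager.py` right, the gauge is just a sum over whatever is still sitting in `_file_stats`, so a single stale entry keeps it pinned at 1. A self-contained illustration of my mental model (not Airflow's actual code; the `DagFileStat` stand-in is simplified):

   ```python
   # Simplified stand-in for the real DagFileStat record in dag_processing/manager.py
   from typing import NamedTuple

   class DagFileStat(NamedTuple):
       num_dags: int
       import_errors: int

   # A stale entry left behind after the file was deleted from the dags folder
   _file_stats = {
       "/usr/local/airflow/dags/etl_facebook_comments.py": DagFileStat(num_dags=0, import_errors=1),
   }

   # Roughly what gets emitted as the airflow.dag_processing.import_errors gauge
   print(sum(stat.import_errors for stat in _file_stats.values()))  # -> 1, forever
   ```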
   
   It looks like this should be handled in `set_file_paths`, but I'm not seeing any log message from `self.log.warning("Stopping processor for %s", file_path)`. That code path seems to assume a deleted file always has an active processor that needs to be stopped, and only removes the stats entry alongside it; I suspect in this case there was no processor running for the file, so its entry never gets purged.
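
   For reference, a rough sketch of the kind of fix I have in mind (the helper name is mine, this is not a patch against `manager.py`): prune `_file_stats` down to the refreshed path list regardless of whether a processor is currently running for the deleted file, e.g. from `set_file_paths`.

   ```python
   # Hypothetical illustration of the pruning I'd expect somewhere in
   # DagFileProcessorManager.set_file_paths; names here are mine, not Airflow's.
   def prune_stale_file_stats(file_stats: dict, new_file_paths: list) -> dict:
       """Keep stats only for files that are still present in the dags folder."""
       keep = set(new_file_paths)
       return {path: stat for path, stat in file_stats.items() if path in keep}

   # With the stale entry from the example above, the next metrics emit would then
   # report 0 import errors once the deleted file drops out of new_file_paths.
   ```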
   
   ### How to reproduce
   
   Reproducing this might be as simple as deleting a DAG file on an instance that has statsd metrics enabled.
   
   I suspect it matters what phase of its loop the DAG file processor manager is in when the file disappears, though, so it may take a few attempts.
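
   A minimal `airflow.cfg` `[metrics]` section along these lines (values are placeholders for a local statsd/DogStatsD listener, not our exact setup) should be enough to get the gauge flowing:

   ```
   [metrics]
   statsd_on = True
   statsd_host = localhost
   statsd_port = 8125
   statsd_prefix = airflow
   ```

   With that enabled, add a DAG file, let it be parsed, then delete it and watch `airflow.dag_processing.import_errors`.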
   
   ### Operating System
   
   Debian GNU/Linux 11 (bullseye)
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-amazon==3.2.0
   apache-airflow-providers-databricks==2.5.0
   apache-airflow-providers-datadog==2.0.4
   apache-airflow-providers-facebook==2.2.3
   apache-airflow-providers-ftp==2.1.2
   apache-airflow-providers-http==2.1.2
   apache-airflow-providers-imap==2.2.3
   apache-airflow-providers-mysql==2.2.3
   apache-airflow-providers-postgres==4.1.0
   apache-airflow-providers-slack==4.2.3
   apache-airflow-providers-sqlite==2.1.3
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   Custom docker container running on ECS.
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

