Casey created AIRFLOW-2167:
------------------------------

             Summary: Scheduler's clear_nonexistent_import_errors function 
should be called on first iteration
                 Key: AIRFLOW-2167
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2167
             Project: Apache Airflow
          Issue Type: Bug
          Components: scheduler
    Affects Versions: 1.9.0
            Reporter: Casey
            Assignee: Casey
         Attachments: Screen Shot 2018-03-02 at 2.08.29 PM.png

In `airflow/jobs.py`, the `clear_nonexistent_import_errors` function is not 
called until the number of seconds defined by `dag_dir_list_interval` has 
elapsed.  If the scheduler is not alive for the full duration of 
`dag_dir_list_interval` (300 seconds), this cleanup never occurs.  In some 
environments this can leave import error messages displayed in the UI 
permanently, even after the DAG has been removed from the environment.

It was previously an Airflow best practice to have the scheduler run N times 
and then terminate.  The scheduler would then be started again by an auxiliary 
process like Docker or Supervisor.  This setup is what brought the bug to my 
attention.

My suggested fix is to tweak `jobs.py` to run the import error cleanup on the 
first iteration and then periodically as defined by `dag_dir_list_interval`.  
This way, a scheduler set up with a small number of runs will still have old 
errors cleaned up.
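
To illustrate the idea, here is a minimal sketch of the loop behavior I have 
in mind.  It is not the actual code from `jobs.py`: the names 
`run_scheduler_loop`, `clear_import_errors`, `clock` and 
`cleanup_on_first_iteration` are hypothetical and only stand in for the real 
scheduler loop and for `clear_nonexistent_import_errors`.

```python
import time


def run_scheduler_loop(num_runs, dag_dir_list_interval, clear_import_errors,
                       clock=time.monotonic, cleanup_on_first_iteration=True):
    """Simplified stand-in for the scheduler loop (hypothetical, not Airflow code).

    Returns how many times the import error cleanup was invoked.
    """
    last_dag_dir_refresh = clock()
    cleanups = 0
    for iteration in range(num_runs):
        elapsed = clock() - last_dag_dir_refresh
        # Proposed change: also clean up on the very first iteration,
        # instead of waiting for dag_dir_list_interval to elapse.
        is_first = cleanup_on_first_iteration and iteration == 0
        if is_first or elapsed > dag_dir_list_interval:
            clear_import_errors()
            last_dag_dir_refresh = clock()
            cleanups += 1
        # ... the rest of the scheduling work would happen here ...
    return cleanups
```

With a scheduler that only lives for a few iterations (well under 300 
seconds), calling this with `cleanup_on_first_iteration=False` never invokes 
the cleanup, which matches the behavior described above.  With the default of 
`True`, the cleanup runs once at startup and then again whenever the interval 
elapses.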

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
