Hi all,

I wanted to gauge community interest in an idea. We are currently running a 
modified version of Airflow 1.9 RC3 that skips processing the DAG definition 
Python files of paused DAGs. By default, list_py_file_paths traverses the dags 
subdirectory looking for Python files, and the scheduler processes all of 
them, regardless of whether the DAGs they define are paused. Our modification 
queries the fileloc column in the dag table, filtering on is_paused=1 and 
is_active=1, to get the list of file paths for paused DAGs. We then exclude 
those files from known_file_paths so that the scheduler does not process them. 
The feature can be turned on and off via a scheduler config variable.
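
To make the idea concrete, here is a minimal sketch of the filtering step, not 
the actual patch. It assumes an in-memory SQLite table standing in for the dag 
table, and the helper names (make_demo_db, paused_dag_file_paths, 
filter_known_file_paths) and the skip_paused flag are hypothetical, chosen 
only for illustration.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in for the dag table (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag ("
        "dag_id TEXT PRIMARY KEY, is_paused INTEGER, "
        "is_active INTEGER, fileloc TEXT)"
    )
    conn.executemany(
        "INSERT INTO dag VALUES (?, ?, ?, ?)",
        [
            ("running", 0, 1, "/dags/running.py"),
            ("paused", 1, 1, "/dags/paused.py"),
            ("stale", 1, 0, "/dags/stale.py"),
        ],
    )
    return conn


def paused_dag_file_paths(conn):
    """Return the fileloc values of DAGs that are both paused and active."""
    rows = conn.execute(
        "SELECT DISTINCT fileloc FROM dag "
        "WHERE is_paused = 1 AND is_active = 1"
    ).fetchall()
    return {row[0] for row in rows if row[0] is not None}


def filter_known_file_paths(known_file_paths, paused_paths, skip_paused=True):
    """Drop paused-DAG files from the list unless the feature is switched off.

    skip_paused stands in for the scheduler config variable.
    """
    if not skip_paused:
        return list(known_file_paths)
    return [path for path in known_file_paths if path not in paused_paths]


if __name__ == "__main__":
    conn = make_demo_db()
    known = ["/dags/running.py", "/dags/paused.py", "/dags/new.py"]
    print(filter_known_file_paths(known, paused_dag_file_paths(conn)))
```

Note that with this literal filter, a file that defines both a paused and an 
unpaused DAG would be skipped entirely, since its path appears in the paused 
set.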

If anyone is interested, we already have the code written, so we'd be happy to 
package up our changes and create a PR.

Thanks!
-Andy
