What's the advantage of this change? Performance?

Alek

On Mon, Nov 27, 2017 at 1:11 PM, [email protected] <
[email protected]> wrote:

> Hi all,
>
> I wanted to gauge community interest in an idea we have. We are
> currently running a modified version of Airflow 1.9 RC3 in which we skip
> processing the DAG definition Python files of paused DAGs. By default,
> list_py_file_paths traverses the dags subdirectory to look for Python
> files, and the scheduler processes all these files, regardless of whether
> the DAGs defined in these files are paused or not. Our proposed
> modification was to query the fileloc column in the dag table, filtering
> on is_paused=1 and is_active=1 to get a list of file paths for paused DAGs.
> Then, we can exclude these files from the known_file_paths, so that the
> scheduler does not process these files. This feature can be turned on and
> off via a scheduler config variable.
>
> If anyone is interested, we already have the code written, so we'd be
> happy to package up our changes and create a PR.
>
> Thanks!
> -Andy
>
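
For readers following the thread, the filtering described above can be sketched roughly as below. This is not Andy's actual patch and does not use Airflow's own code: it stands in an in-memory SQLite table for the dag table, and the helper names (paused_dag_file_paths, exclude_paused) and the sample paths are hypothetical. Only the column names (fileloc, is_paused, is_active) and the known_file_paths name come from the message.

```python
import sqlite3


def paused_dag_file_paths(conn):
    """Return the set of fileloc values for DAGs that are active and paused."""
    rows = conn.execute(
        "SELECT DISTINCT fileloc FROM dag "
        "WHERE is_paused = 1 AND is_active = 1"
    ).fetchall()
    return {row[0] for row in rows if row[0] is not None}


def exclude_paused(known_file_paths, paused_paths):
    """Drop files of paused DAGs, preserving the original order."""
    return [path for path in known_file_paths if path not in paused_paths]


# Stand-in for the metadata database (assumption: a simplified dag table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dag ("
    "dag_id TEXT PRIMARY KEY, is_paused INTEGER, "
    "is_active INTEGER, fileloc TEXT)"
)
conn.executemany(
    "INSERT INTO dag VALUES (?, ?, ?, ?)",
    [
        ("etl_daily", 0, 1, "/dags/etl_daily.py"),    # running
        ("old_report", 1, 1, "/dags/old_report.py"),  # paused and active
        ("retired", 1, 0, "/dags/retired.py"),        # paused but inactive
    ],
)

# Stand-in for the result of traversing the dags folder.
known_file_paths = [
    "/dags/etl_daily.py",
    "/dags/old_report.py",
    "/dags/retired.py",
    "/dags/new_dag.py",  # not yet in the dag table, so it is still processed
]

to_process = exclude_paused(known_file_paths, paused_dag_file_paths(conn))
print(to_process)
```

One thing this simple version does not handle: if a single file defines both a paused and an unpaused DAG, the whole file is skipped, so the unpaused DAG would stop being processed as well.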
