Cedric Hourcade created AIRFLOW-1729:
----------------------------------------
Summary: Ignore whole directories in .airflowignore
Key: AIRFLOW-1729
URL: https://issues.apache.org/jira/browse/AIRFLOW-1729
Project: Apache Airflow
Issue Type: Improvement
Components: core
Affects Versions: Airflow 2.0
Reporter: Cedric Hourcade
Priority: Minor
The .airflowignore file lets you exclude files from the DAG scan. But even if
a whole directory is blacklisted, {{os.walk}} still descends into it, however
deep it is, and skips its files one by one. This can be an issue when you
keep large .git or virtualenv directories around.
I suggest adding something like:
{code}
# Assign in place so os.walk (top-down) skips the removed directories entirely.
dirs[:] = [d for d in dirs
           if not any(re.search(p, os.path.join(root, d)) for p in patterns)]
{code}
to prune the directories here:
https://github.com/apache/incubator-airflow/blob/cfc2f73c445074e1e09d6ef6a056cd2b33a945da/airflow/utils/dag_processing.py#L208-L209
and in {{list_py_file_paths}}
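
To make the proposal concrete, here is a minimal, self-contained sketch of the pruning idea. It is not the Airflow implementation: the function name {{walk_py_files}} and the way {{patterns}} is passed in as a plain list of regex strings are assumptions made for illustration only. It relies on the documented {{os.walk}} behavior that, in top-down mode, modifying {{dirs}} in place controls which subdirectories are visited.

```python
import os
import re


def walk_py_files(dag_folder, patterns):
    """Yield .py file paths under dag_folder, skipping ignored paths.

    Hypothetical helper for illustration; `patterns` is assumed to be a
    list of regex strings such as those read from a .airflowignore file.
    """
    compiled = [re.compile(p) for p in patterns]

    def ignored(path):
        # A path is ignored if any pattern matches anywhere in it.
        return any(p.search(path) for p in compiled)

    for root, dirs, files in os.walk(dag_folder):
        # Slice assignment mutates the list os.walk holds, so ignored
        # directories are never entered at all.
        dirs[:] = [d for d in dirs if not ignored(os.path.join(root, d))]
        for name in files:
            path = os.path.join(root, name)
            if name.endswith(".py") and not ignored(path):
                yield path
```

With patterns such as {{\.git}} and {{virtualenv}}, the walk would stop at the top of those directories instead of visiting every file beneath them. Note that the patterns are searched against the full path, so a pattern that also matches part of the DAG folder path itself would exclude everything.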