[
https://issues.apache.org/jira/browse/AIRFLOW-6171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992134#comment-16992134
]
Andrey Klochkov edited comment on AIRFLOW-6171 at 12/10/19 2:49 AM:
--------------------------------------------------------------------
This is happening due to the following defect in
{{dag_processing.list_py_file_paths}}. In the loop that walks through
subdirectories, the same {{patterns}} object is written to the dictionary
{{patterns_by_dir}} under different keys. When the loop goes through the
top-level dags directory, it stores that one {{patterns}} list under the keys
corresponding to each of the subdirectories. Then, when the loop reaches a
subdirectory, it fetches that same list from the map and adds its own patterns
to it, so an {{.airflowignore}} present in one subdirectory is effectively
applied to all other subdirectories processed later.
The fix is to add ".copy()" as shown here:
{code:python}
# We want patterns defined in a parent folder's .airflowignore to
# apply to subdirs too
for d in dirs:
    patterns_by_dir[os.path.join(root, d)] = patterns.copy()
{code}
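To illustrate the aliasing described above, here is a minimal, self-contained sketch. It is a simplified model of the walk, not Airflow's actual code: {{list_visible_files}}, {{run_demo}}, and the a/, b/, c/ layout are hypothetical names chosen for the example, and it assumes patterns are added to the fetched list in place. The ignore file is placed in the first directory visited so that the leak into later siblings is visible.

```python
import os
import re
import tempfile


def list_visible_files(base, copy_patterns):
    """Hypothetical, simplified model of the directory walk (not Airflow code)."""
    patterns_by_dir = {}
    visible = []
    for root, dirs, files in os.walk(base):
        dirs.sort()  # visit subdirectories in a deterministic order
        patterns = patterns_by_dir.get(root, [])
        ignore_file = os.path.join(root, ".airflowignore")
        if os.path.isfile(ignore_file):
            with open(ignore_file) as fh:
                # Assumption: in-place extension, which mutates the shared list
                patterns += [re.compile(p) for p in fh.read().split("\n") if p]
        for d in dirs:
            key = os.path.join(root, d)
            # Without .copy(), every sibling directory shares one list object
            patterns_by_dir[key] = patterns.copy() if copy_patterns else patterns
        for name in sorted(files):
            path = os.path.join(root, name)
            if name.endswith(".py") and not any(p.search(path) for p in patterns):
                visible.append(os.path.relpath(path, base))
    return visible


def run_demo():
    """Build a/, b/, c/ with one dag each; only a/ has an .airflowignore."""
    with tempfile.TemporaryDirectory() as base:
        for sub in ("a", "b", "c"):
            os.makedirs(os.path.join(base, sub))
            with open(os.path.join(base, sub, "dag.py"), "w") as fh:
                fh.write("# placeholder dag\n")
        with open(os.path.join(base, "a", ".airflowignore"), "w") as fh:
            fh.write(".*\n")
        buggy = list_visible_files(base, copy_patterns=False)
        fixed = list_visible_files(base, copy_patterns=True)
    return buggy, fixed


if __name__ == "__main__":
    buggy, fixed = run_demo()
    print("shared list:", buggy)
    print("with copy():", fixed)
```

With the shared list, the pattern from a/ also hides the dags in b/ and c/; with {{.copy()}} only a/ is affected. Because the leak only reaches directories visited after the one holding the ignore file, the walk order would explain why only some dags disappear.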
> airflow ignore file with .* located in a subdirectory ignores dags in other
> dirs
> --------------------------------------------------------------------------------
>
> Key: AIRFLOW-6171
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6171
> Project: Apache Airflow
> Issue Type: Bug
> Components: core, DAG
> Affects Versions: 1.10.5, 1.10.6
> Environment: Ubuntu 18.04
> Reporter: Andrey Kateshov
> Priority: Major
>
> I have an airflow dags directory looking like this: x/... y/... z/..., i.e.
> all dags are placed in subdirectories.
> If I place an .airflowignore with a single line of .* in directory z/, the
> dags in other directories (e.g. x/ and y/) are also ignored, which is already
> a big issue. What makes it even stranger is that only some of them are
> ignored, potentially masking the effects of this behaviour.
> What makes it even worse is that you won't see that these dags are now
> disabled in the airflow UI unless you completely restart it (possibly
> together with the scheduler; we restarted both and didn't try to see if
> restarting only the UI is enough).
> This issue was not present in 1.10.3, but appears in 1.10.5. I didn't test
> 1.10.4.