[ 
https://issues.apache.org/jira/browse/AIRFLOW-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090956#comment-17090956
 ] 

ASF GitHub Bot commented on AIRFLOW-6796:
-----------------------------------------

kaxil commented on a change in pull request #7424:
URL: https://github.com/apache/airflow/pull/7424#discussion_r414148564



##########
File path: airflow/config_templates/config.yml
##########
@@ -1367,6 +1367,14 @@
       type: string
       example: ~
       default: "30"
+    - name: dag_cleanup_interval
+      description: |
+        How often unseen Dags should be marked inactive and their 
serializations deleted.
+        Setting to 0 will disable DAG cleanup.  Defaults to 5 minutes

Review comment:
       ```suggestion
           Setting to 0 will disable DAG cleanup. Defaults to 5 minutes
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


> Serialized DAGs can be incorrectly deleted
> ------------------------------------------
>
>                 Key: AIRFLOW-6796
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6796
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: serialization
>    Affects Versions: 1.10.9
>            Reporter: Matthew Bruce
>            Priority: Major
>
> With serialization of DAGs enabled, `SerializedDagModel.remove_deleted_dags` 
> called from `DagFileProcessManager.refresh_dag_dir` can delete the 
> serialization of DAGs if they were loaded via a DagBag and globals in a 
> different `.py` file:
> Consider something like this:
>  {{/home/airflow/dags/loader.py}}
> {code:python}
> dag_bags = []
> dag_bags.append(models.DagBag('/home/airflow/project-a/dags')
> dag_bags.append(models.DagBag('/home/airflow/project-b/dags')
> for dag_bag in dag_bags:
>     for dag in dag_bag:
>       globals()[dag.dag_id] = dag{code}
> with files:
> {code:java}
> /home/airflow/project-a/dags/dag-a.py
> /home/airflow/project-b/dags/dag-b.py
> {code}
>  
> The list of file paths passed to {{SerializedDagModel.remove_deleted_dags}} 
> is only going to contain {{/home/airflow/dags/loader.py}} and the method will 
> remove the serializations for the DAGs in dag-a.py and dag-b.py
> With non-serialized DAGs, airflow seems to mark DAGs as inactive based on 
> when the scheduler last processed them - I wonder if we should make these two 
> methods consistent?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to