[jira] [Commented] (AIRFLOW-5523) DAGs never get cleaned up from webservers if DAG file is removed from the scheduler
[ https://issues.apache.org/jira/browse/AIRFLOW-5523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006379#comment-17006379 ]

t oo commented on AIRFLOW-5523:
---

My thoughts: auto-delete DAGs that are no longer in any DagBag, i.e. if a dynamic DAG generator used to make 10 DAGs and now makes 7, the other 3 should be auto-deleted. This would be guarded by a config option, 'auto_delete_dags_not_found'.

> DAGs never get cleaned up from webservers if DAG file is removed from the scheduler
> ---
>
> Key: AIRFLOW-5523
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5523
> Project: Apache Airflow
> Issue Type: Bug
> Components: webserver
> Affects Versions: 1.10.5
> Reporter: Dan Davydov
> Priority: Major
> Attachments: Screen Shot 2019-09-19 at 4.39.24 PM.png
>
> DAGs are never removed from the DB if the DAG file used to generate the DAGs
> is removed. This causes the webserver to accumulate broken dag rows (see
> attached screenshot).
> The scheduler should delete all DAGs from the DAG table with files that no
> longer exist when it scans the DAGs folder for files.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
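A minimal sketch of the auto-delete proposal in the comment above, in plain Python rather than against Airflow's actual models. The function names and the way DAG ids are passed in are assumptions made for illustration; only the option name 'auto_delete_dags_not_found' comes from the comment, and it is a proposed option, not an existing Airflow setting.

```python
from typing import Iterable, Set


def find_stale_dag_ids(db_dag_ids: Iterable[str],
                       dagbag_dag_ids: Iterable[str]) -> Set[str]:
    """Return DAG ids that have a DB row but are absent from the parsed DagBag."""
    return set(db_dag_ids) - set(dagbag_dag_ids)


def dag_ids_to_delete(db_dag_ids: Iterable[str],
                      dagbag_dag_ids: Iterable[str],
                      auto_delete_dags_not_found: bool = False) -> Set[str]:
    """Select DAG ids for deletion, guarded by the proposed (hypothetical) option."""
    # With the option off, nothing is ever selected for deletion.
    if not auto_delete_dags_not_found:
        return set()
    return find_stale_dag_ids(db_dag_ids, dagbag_dag_ids)


# Example from the comment: the generator used to make 10 DAGs, now makes 7.
in_db = [f"dag_{i}" for i in range(10)]
in_dagbag = [f"dag_{i}" for i in range(7)]
stale = dag_ids_to_delete(in_db, in_dagbag, auto_delete_dags_not_found=True)
```

The sketch only selects candidates; what "delete" should then mean (the dag table row only, or the full run history) is the open question discussed elsewhere in this thread.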
[jira] [Commented] (AIRFLOW-5523) DAGs never get cleaned up from webservers if DAG file is removed from the scheduler
[ https://issues.apache.org/jira/browse/AIRFLOW-5523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993246#comment-16993246 ]

Vishesh Jain commented on AIRFLOW-5523:
---

[~chrispalmer16] Here is a discussion thread for this Jira <[https://lists.apache.org/thread.html/53b5a3bc63f39415850cd12c34ee2903aa6d132248d1e02a66816675%40%3Cdev.airflow.apache.org%3E]>
[jira] [Commented] (AIRFLOW-5523) DAGs never get cleaned up from webservers if DAG file is removed from the scheduler
[ https://issues.apache.org/jira/browse/AIRFLOW-5523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990015#comment-16990015 ]

Chris Palmer commented on AIRFLOW-5523:
---

What is the goal here? If it is to clean up the DAGs shown in the UI, then there are other ways to hide those (maybe in the same way that we can hide paused DAGs). If it is to clean up the database, then I worry that automating this could have unintended consequences.

If I'm not mistaken, deleting a DAG via the UI or CLI currently results in all of its database records being deleted, including task instances and dag runs. Is the intention of this ticket to do that full DAG deletion, or is it only to delete the row in the dag table?

If the former, then I think that is a really bad idea. It would only take one small mistake in syncing/deploying your dag files for the scheduler to delete all your history, which could be a major problem.
[jira] [Commented] (AIRFLOW-5523) DAGs never get cleaned up from webservers if DAG file is removed from the scheduler
[ https://issues.apache.org/jira/browse/AIRFLOW-5523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942070#comment-16942070 ]

Dan Davydov commented on AIRFLOW-5523:
---

My concern with that would be that this could filter out new DAGs, especially ones that the user only intends to trigger manually. Maybe we can have some heuristic where if the DAG was added X days ago and still has not run we hide it, but I'm not sure that it's necessary.

I think eventually, along your lines of thinking, it would be nice to have a notification sent out about DAGs that have not run in a long time, for clean-up/scheduler resources purposes, or maybe even to go a step further and automatically expire and hide these DAGs, although that might be better suited to a plugin or something like that.
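A rough sketch of the "added X days ago and still has not run" heuristic from the comment above. The function name, its parameters, and the 30-day default are hypothetical and chosen only for illustration; nothing here reflects an existing Airflow API.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_hide_dag(added_at: datetime,
                    last_run_at: Optional[datetime],
                    now: datetime,
                    max_idle_days: int = 30) -> bool:
    """Hide a DAG only if it has never run and was added more than max_idle_days ago."""
    # Any DAG that has run at least once is always kept visible.
    if last_run_at is not None:
        return False
    # Never-run DAGs stay visible until the grace period has elapsed.
    return now - added_at > timedelta(days=max_idle_days)


now = datetime(2019, 10, 1)
old_never_run = should_hide_dag(datetime(2019, 8, 1), None, now)
new_never_run = should_hide_dag(datetime(2019, 9, 25), None, now)
old_but_ran = should_hide_dag(datetime(2019, 8, 1), datetime(2019, 8, 2), now)
```

The grace period is what keeps newly added, manually triggered DAGs from being hidden straight away, which is the concern raised in the comment.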
[jira] [Commented] (AIRFLOW-5523) DAGs never get cleaned up from webservers if DAG file is removed from the scheduler
[ https://issues.apache.org/jira/browse/AIRFLOW-5523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936011#comment-16936011 ]

Ash Berlin-Taylor commented on AIRFLOW-5523:
---

Yes to tidying errors, but probably extend the "should delete DAGs that no longer exist" to filter out dags that have ever been run.