AnumSheraz opened a new issue, #23890:
URL: https://github.com/apache/airflow/issues/23890
### Apache Airflow version
2.1.4
### What happened
We dynamically generate DAGs based on directory names in a specific path,
using the following code:
```
# dag_generator.py
analytics_dag_path = Path(DAG_SCRIPTS_PATH, 'active')
globals()['analytics_dag'] = create_dag(
    dag_id='analytics_dag',
    dag_scripts_path=analytics_dag_path,
    hourly=False
)
```
Under this directory we had a single directory named `analytics_dag`. This was
running fine, but because of its huge size we decided to split this DAG into
smaller DAGs. We now have a top-level directory called `daily`, under which
there are multiple directories, each served as a small DAG:
```
# dag_generator.py
daily_paths = (
    path for path in Path(DAG_SCRIPTS_PATH, 'daily').iterdir() if path.is_dir()
)
daily_config = global_config.get('daily', {})
for path in daily_paths:
    globals()[path.name] = create_dag(
        dag_id=path.name,
        dag_scripts_path=path,
        dag_config=daily_config.get(path.name, {})
    )
```
**NOTE**: the old directory `active/` is completely removed!
All went well: we can now see multiple DAGs, named after the directories under
`daily/`.
**PROBLEM:**
The old DAG `analytics_dag` is still visible in the web UI. I cannot delete
this DAG from the web UI; the request returns 404 - not found. That makes
sense, because when I run `airflow dags list` I do not see this DAG in the
list. BUT the DAG wasn't removed from our RDS database:
```
mysql> select dag_id from dag;
+-------------------------+
| dag_id                  |
+-------------------------+
| analytics_dag           |
| another_dag             |
| all_other_dags          |
| ...                     |
+-------------------------+
16 rows in set (0.00 sec)
```
Now I cannot simply delete this record from the `dag` table directly, because
I know there are other things associated with this table.
I think this happened because the file location (`fileloc`) of both the new
DAGs and the old DAG is the same, `dag_generator.py`, so Airflow was not able
to figure out which one is stale.
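
To make the hypothesis above concrete, here is a minimal sketch. It is not Airflow's actual cleanup code; it only assumes a check that decides staleness by whether a DAG's `fileloc` still exists on disk. The `DagRow` type and the `stale_by_missing_file` helper are hypothetical names used for illustration.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DagRow:
    """Hypothetical stand-in for one row of the `dag` table."""
    dag_id: str
    fileloc: str


def stale_by_missing_file(rows):
    """Return dag_ids whose source file no longer exists on disk."""
    return [row.dag_id for row in rows if not Path(row.fileloc).exists()]


with tempfile.TemporaryDirectory() as tmp:
    generator = Path(tmp, "dag_generator.py")
    generator.touch()  # the generator file itself is still present

    rows = [
        DagRow("analytics_dag", str(generator)),  # no longer generated
        DagRow("another_dag", str(generator)),    # still generated
    ]
    # Both rows share one existing fileloc, so this check flags nothing.
    flagged = stale_by_missing_file(rows)
    print(flagged)
```

Under that assumption, a file-existence check cannot tell `analytics_dag` apart from the DAGs that `dag_generator.py` still produces.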
### What you think should happen instead
When Airflow creates DAGs dynamically (as shown in the `dag_generator.py`
script), it should be able to figure out which DAGs were created successfully
and which DAGs are stale, and then remove the stale DAGs from the database
too.
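
The expected behaviour could be sketched as a set difference, assuming the component doing the cleanup knows which DAG IDs each file produced on its latest parse. `find_stale_dag_ids` and both of its arguments are hypothetical and not an Airflow API.

```python
def find_stale_dag_ids(registered, parsed_now):
    """Return registered dag_ids that the latest parse no longer produced.

    registered: mapping of dag_id -> fileloc, as stored in the database.
    parsed_now: mapping of fileloc -> set of dag_ids from the latest parse.
    """
    stale = []
    for dag_id, fileloc in registered.items():
        # Files that were not parsed in this pass are left alone.
        if fileloc in parsed_now and dag_id not in parsed_now[fileloc]:
            stale.append(dag_id)
    return sorted(stale)


registered = {
    "analytics_dag": "dag_generator.py",
    "another_dag": "dag_generator.py",
}
parsed_now = {"dag_generator.py": {"another_dag"}}

stale = find_stale_dag_ids(registered, parsed_now)
print(stale)  # ['analytics_dag']
```

Comparing per file rather than globally keeps DAGs from files that were simply not parsed in the current pass from being flagged by mistake.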
### How to reproduce
1. Create DAGs dynamically based on directory names (as shown in the
`dag_generator.py` code).
2. Once they are loaded into Airflow successfully, remove any one of the
directories.
3. The DAG named after the removed directory will still be visible in the web
UI and the database, but not in `airflow dags list`, and you won't be able to
remove it.
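
The steps above can be simulated without an Airflow installation. The sketch below uses a temporary directory and a plain dict standing in for the `dag` table; the upsert-only `sync` function is an assumption that mimics the observed behaviour, not a description of Airflow internals, and `generate_dag_ids` is a hypothetical helper mirroring the directory scan.

```python
import shutil
import tempfile
from pathlib import Path


def generate_dag_ids(root):
    """Mirror the directory scan: one dag_id per sub-directory of root."""
    return {path.name for path in Path(root).iterdir() if path.is_dir()}


def sync(table, dag_ids):
    """Upsert-only sync into a dict standing in for the `dag` table."""
    for dag_id in dag_ids:
        table[dag_id] = "dag_generator.py"


with tempfile.TemporaryDirectory() as tmp:
    daily = Path(tmp, "daily")
    for name in ("dag_a", "dag_b"):
        (daily / name).mkdir(parents=True)

    table = {}
    sync(table, generate_dag_ids(daily))  # step 1: both DAGs registered

    shutil.rmtree(daily / "dag_b")        # step 2: remove one directory
    current = generate_dag_ids(daily)
    sync(table, current)

    # Step 3: the removed DAG is still in the table but no longer generated.
    leftover = sorted(set(table) - current)
    print(leftover)  # ['dag_b']
```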
### Operating System
Debian GNU/Linux 10 (buster)
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon==2.2.0
apache-airflow-providers-celery==2.0.0
apache-airflow-providers-cncf-kubernetes==2.0.2
apache-airflow-providers-docker==2.1.1
apache-airflow-providers-elasticsearch==2.0.3
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-http==2.0.1
apache-airflow-providers-imap==2.0.1
apache-airflow-providers-mysql==2.1.1
apache-airflow-providers-postgres==2.2.0
apache-airflow-providers-sftp==2.1.1
apache-airflow-providers-slack==4.0.1
apache-airflow-providers-sqlite==2.0.1
apache-airflow-providers-ssh==2.1.1
### Deployment
Other
### Deployment details
We are running airflow on Kubernetes with our own helm charts.
### Anything else
```
mysql> SHOW variables LIKE '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| aurora_version          | 2.02.5                       |
| innodb_version          | 5.7.12                       |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| tls_version             | TLSv1,TLSv1.1,TLSv1.2        |
| version                 | 5.7.12                       |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Linux                        |
+-------------------------+------------------------------+
```
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)