rjh-yext opened a new issue, #59474:
URL: https://github.com/apache/airflow/issues/59474

   ### Apache Airflow version
   
   Other Airflow 3 version (please specify below)
   
   ### If "Other Airflow 3 version" selected, which one?
   
   apache/airflow:3.1.3 Docker image
   
   ### What happened?
   
   When using the `airflow db clean` command, an error occurs when deleting 
rows from the `dag_version` table. This is caused by a violation of the 
`task_instance_dag_version_id_fkey` constraint added by revision id 
`3ac9e5732b1f`, see error message below.
   
   What appears to be happening is that there can be rows in the 
`task_instance` table that are not being deleted because they do not fall 
within the age limit criteria, that reference rows in the `dag_version` table 
that are being deleted because they do fall within the age limit criteria. 
   
   The `db clean` command is removing rows from both the `task_instance` and 
`dag_version` table based solely on whether they fit the age limit criteria, 
and does not take into consideration any fk constraints. I believe this means:
   - `task_instance` uses columns `start_date` or `updated_at`
   - `dag_version` uses `last_updated` or `created_at`
   
   Error message:
   
   `sqlalchemy.exc.IntegrityError: (MySQLdb.IntegrityError) (1451, 'Cannot 
delete or update a parent row: a foreign key constraint fails 
(`airflow`.`task_instance`, CONSTRAINT `task_instance_dag_version_id_fkey` 
FOREIGN KEY (`dag_version_id`) REFERENCES `dag_version` (`id`) ON DELETE 
RESTRICT)')
   06:02:09   [SQL: DELETE FROM dag_version USING dag_version, 
_airflow_deleted__dag_version__20251215060205 WHERE dag_version.id = 
_airflow_deleted__dag_version__20251215060205.id]`
   
   
   ### What you think should happen instead?
   
   _No response_
   
   ### How to reproduce
   
   Have a dag that has not been updated for a period of time, but is run 
recently. 
   
   Run `airflow db clean` with a date before the recent dag run, but after the 
date the dag was last updated. We should expect the `dag_version` row to be 
deleted, but the `task_instance` row to be kept.
   
   ### Operating System
   
   Debian GNU/Linux 12 (bookworm)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to