neptune19821220 opened a new issue #20016: URL: https://github.com/apache/airflow/issues/20016
### Apache Airflow version

2.2.2 (latest released)

### Operating System

docker image apache/airflow:2.2.2-python3.9

### Versions of Apache Airflow Providers

_No response_

### Deployment

Other Docker-based deployment

### Deployment details

I deployed Airflow in an AWS ECS Fargate cluster. I use Aurora Serverless MySQL 5.7 as the Airflow db backend and an AWS Redis cluster as AIRFLOW__CELERY__BROKER_URL. I also noticed the db backend charset setting, so I configured these db parameters:

```
character_set_client: utf8mb4
character_set_connection: utf8mb4
character_set_database: utf8mb4
character_set_server: utf8mb4
collation_connection: utf8mb4_unicode_ci
collation_server: utf8mb4_unicode_ci
```

### What happened

With the apache/airflow:2.1.3-python3.8 image everything worked well. Yesterday I used apache/airflow:2.2.2-python3.9 to upgrade my cluster to the latest version. When the init db service ran `airflow db upgrade`, it threw this exception:

```
sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1032, "Can't find record in 'task_instance'")
[SQL: UPDATE task_instance, dag_run SET task_instance.run_id=dag_run.run_id WHERE dag_run.dag_id = task_instance.dag_id AND dag_run.execution_date = task_instance.execution_date]
```

I think it stopped at line 187 of the migration file https://github.com/apache/airflow/blob/5dd690b57a20ca944deb8d96e89ec6ae6161afeb/airflow/migrations/versions/7b2661a43ba3_taskinstance_keyed_to_dagrun.py. I even tried running this SQL from a mysql client, and it still throws the same error. Because the upgrade process has already finished some steps (for example, dropping dag_id and dag_id_2), it is impossible to re-run `airflow db upgrade` to try again. May I know how to get out of this state other than restoring the db from my backup?

### What you expected to happen

The db upgrade process should finish successfully.

### How to reproduce

Create an ECS Fargate cluster with airflow 2.1.3-python3.8 based on https://airflow.apache.org/docs/apache-airflow/2.1.3/docker-compose.yaml. Use Aurora Serverless MySQL 5.7 as the Airflow db backend and an AWS Redis cluster as AIRFLOW__CELERY__BROKER_URL. Then replace the Airflow image with airflow 2.2.2-python3.9 and update the template.

I am not sure whether this issue can be reproduced. I have three environments created from the same CloudFormation template, so the infrastructure should be identical. I am not sure whether it is related to the number of records in the tables:

- Develop: about 3k records in task_instance and about 300 records in dag_run.
- Staging: about 13k records in task_instance and about 1300 records in dag_run.
- Production: about 13k records in task_instance and about 1300 records in dag_run.

The upgrade finished successfully in the develop environment, but it always fails in the staging and production environments.

### Anything else

1. If the db upgrade process fails, is it possible to roll back all changes made so far during the upgrade?
2. Is it possible to provide a SQL script that does the same thing as the db migration file? When I am stuck at some step, I could then fix the db manually and run the remaining SQL directly (a rough diagnostic query is appended at the end of this issue).

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
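---

A minimal diagnostic sketch, assuming the failing UPDATE is tripped up by task_instance rows that have no matching dag_run row (that is only a guess; the 1032 error could also be an Aurora/MySQL 5.7 engine-level issue). The table and column names below (task_instance.dag_id/execution_date, dag_run.dag_id/execution_date) are as they exist at this point in the migration; the query only reads data, so it should be safe to run before attempting the UPDATE again manually:

```sql
-- Diagnostic sketch (not part of the official migration):
-- list task_instance rows with no matching dag_run for the same
-- dag_id/execution_date. Such rows would not receive a run_id from
-- the migration's multi-table UPDATE.
SELECT ti.dag_id, ti.task_id, ti.execution_date
FROM task_instance AS ti
LEFT JOIN dag_run AS dr
       ON dr.dag_id = ti.dag_id
      AND dr.execution_date = ti.execution_date
WHERE dr.dag_id IS NULL;
```

If this returns rows, cleaning them up (or restoring from backup and cleaning up before the upgrade) might let the migration's UPDATE complete; if it returns nothing, the orphaned-rows assumption is wrong and the problem likely lies elsewhere.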
