tanvn opened a new issue, #28051:
URL: https://github.com/apache/airflow/issues/28051
### Apache Airflow version
2.4.3
### What happened
Tried executing the following command to clean old data in my service:
```
airflow db clean --skip-archive --clean-before-timestamp '2022-10-31
00:00:00+0000' --tables 'xcom, log, dag_run, task_instance' --verbose --yes
```
Then got the following error
```
...
[2022-12-02T08:34:18.065+0000] {db_cleanup.py:138} DEBUG - ctas query:
CREATE TABLE _airflow_deleted__dag_run__20221202083418 AS SELECT base.*
FROM dag_run AS base LEFT OUTER JOIN (SELECT dag_id, max(dag_run.start_date)
AS max_date_per_group
FROM dag_run
WHERE external_trigger = false GROUP BY dag_id) AS latest ON base.dag_id =
latest.dag_id AND base.start_date = max_date_per_group
WHERE base.start_date < :start_date_1 AND max_date_per_group IS NULL
[2022-12-02T08:34:18.069+0000] {cli_action_loggers.py:83} DEBUG - Calling
callbacks: []
Traceback (most recent call last):
File
"/opt/app-root/lib64/python3.8/site-packages/mysql/connector/connection_cext.py",
line 565, in cmd_query
self._cmysql.query(
_mysql_connector.MySQLInterfaceError: Statement violates GTID consistency:
CREATE TABLE ... SELECT.The above exception was the direct cause of the
following exception:Traceback (most recent call last):
File
"/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line
1802, in _execute_context
self.dialect.do_execute(
File
"/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/default.py",
line 719, in do_execute
cursor.execute(statement, parameters)
File
"/opt/app-root/lib64/python3.8/site-packages/mysql/connector/cursor_cext.py",
line 279, in execute
result = self._cnx.cmd_query(
File
"/opt/app-root/lib64/python3.8/site-packages/mysql/connector/connection_cext.py",
line 573, in cmd_query
raise get_mysql_exception(
mysql.connector.errors.DatabaseError: 1786 (HY000): Statement violates GTID
consistency: CREATE TABLE ... SELECT.The above exception was the direct cause
of the following exception:Traceback (most recent call last):
...
```
our MySQL server has `enforce_gtid_consistency` parameter set to ON and it
does not allow `CREATE TABLE ... SELECT` statement.
### What you think should happen instead
I expected that the command should work for MySQL with
enforce_gtid_consistency enabled.
### How to reproduce
As described above.
### Operating System
K8S with base image is CentOS
### Versions of Apache Airflow Providers
_No response_
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
_No response_
### Anything else
Can we avoid this kind of issue by breaking the statement into 2, like:
```
CREATE TABLE _airflow_deleted__dag_run__20221202083418 LIKE dag_run;
INSERT _airflow_deleted__dag_run__20221202083418 SELECT * FROM SELECT base.*
FROM dag_run AS base LEFT OUTER JOIN (SELECT dag_id, max(dag_run.start_date)
AS max_date_per_group
FROM dag_run
WHERE external_trigger = false GROUP BY dag_id) AS latest ON base.dag_id =
latest.dag_id AND base.start_date = max_date_per_group
WHERE base.start_date < :start_date_1 AND max_date_per_group IS NULL;
```
Ref: https://stackoverflow.com/a/56068655/1526790
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]