Lee2532 opened a new issue, #39784:
URL: https://github.com/apache/airflow/issues/39784
### Description
When MySQL is the metadata database and thousands of S3 data files are
transferred to GCS, the files are moved successfully. The task then fails
while recording the paths of these files in the metadata (XCom), because
the list of paths exceeds the MySQL BLOB length limit.
To resolve this issue, it would be helpful to have a feature that compresses
this list when the XCom data is not otherwise used.
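As a rough illustration of the idea, the sketch below compresses a list of paths with `zlib` and compares the result against the 65,535-byte maximum of a MySQL BLOB column. It is not Airflow's XCom API: the helper names (`compress_paths`, `decompress_paths`, `make_example_paths`) and the example object keys are hypothetical, and JSON is assumed as the serialization format purely for the sake of the example.

```python
import json
import zlib

# Maximum size of a MySQL BLOB column in bytes (2**16 - 1).
MYSQL_BLOB_MAX_BYTES = 65_535


def compress_paths(paths: list[str]) -> bytes:
    """Serialize a list of paths to JSON, then compress it with zlib."""
    raw = json.dumps(paths).encode("utf-8")
    return zlib.compress(raw, level=9)


def decompress_paths(blob: bytes) -> list[str]:
    """Reverse compress_paths: decompress, then parse the JSON list."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def make_example_paths(count: int) -> list[str]:
    """Build hypothetical object keys that share a long common prefix."""
    prefix = "data/year=2024/month=05/day=06/hour=15"
    return [
        f"{prefix}/part-{i:05d}-some-long-file-name.parquet"
        for i in range(count)
    ]


if __name__ == "__main__":
    paths = make_example_paths(1330)
    raw_size = len(json.dumps(paths).encode("utf-8"))
    blob = compress_paths(paths)
    print(f"uncompressed: {raw_size} bytes")
    print(f"compressed:   {len(blob)} bytes")
    print(f"fits in BLOB: {len(blob) <= MYSQL_BLOB_MAX_BYTES}")
```

Paths that share long prefixes compress well, but compression only lowers the size; it does not guarantee that an arbitrarily long list fits. A custom XCom backend might be one place for such logic, though that is not shown here.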
### Use case/motivation
[2024-05-06, 15:31:02 KST] {s3_to_gcs.py:262} INFO - All done, uploaded 1330
files to Google Cloud Storage
[2024-05-06, 15:31:02 KST] {taskinstance.py:1935} ERROR - Task failed with
exception
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py",
line 1910, in _execute_context
self.dialect.do_execute(
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/default.py",
line 736, in do_execute
cursor.execute(statement, parameters)
File "/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/cursors.py",
line 179, in execute
res = self._query(mogrified_query)
File "/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/cursors.py",
line 330, in _query
db.query(q)
File
"/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/connections.py", line
255, in query
_mysql.connection.query(self, query)
MySQLdb.DataError: (1406, "Data too long for column 'value' at row 1")
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/session.py",
line 74, in wrapper
return func(*args, **kwargs)
File
"/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py",
line 2477, in xcom_push
XCom.set(
File
"/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/session.py",
line 74, in wrapper
return func(*args, **kwargs)
File
"/home/airflow/.local/lib/python3.9/site-packages/airflow/models/xcom.py", line
273, in set
session.flush()
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/session.py",
line 3449, in flush
self._flush(objects)
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/session.py",
line 3589, in _flush
transaction.rollback(_capture_exception=True)
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py",
line 70, in __exit__
compat.raise_(
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/util/compat.py",
line 211, in raise_
raise exception
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/session.py",
line 3549, in _flush
flush_context.execute()
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py",
line 456, in execute
rec.execute(self)
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py",
line 630, in execute
util.preloaded.orm_persistence.save_obj(
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/persistence.py",
line 245, in save_obj
_emit_insert_statements(
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/persistence.py",
line 1097, in _emit_insert_statements
c = connection._execute_20(
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py",
line 1710, in _execute_20
return meth(self, args_10style, kwargs_10style, execution_options)
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py",
line 334, in _execute_on_connection
return connection._execute_clauseelement(
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py",
line 1577, in _execute_clauseelement
ret = self._execute_context(
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py",
line 1953, in _execute_context
self._handle_dbapi_exception(
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py",
line 2134, in _handle_dbapi_exception
util.raise_(
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/util/compat.py",
line 211, in raise_
raise exception
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py",
line 1910, in _execute_context
self.dialect.do_execute(
File
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/default.py",
line 736, in do_execute
cursor.execute(statement, parameters)
File "/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/cursors.py",
line 179, in execute
res = self._query(mogrified_query)
File "/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/cursors.py",
line 330, in _query
db.query(q)
File
"/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/connections.py", line
255, in query
_mysql.connection.query(self, query)
sqlalchemy.exc.DataError: (MySQLdb.DataError) (1406, "Data too long for
column 'value' at row 1")
[SQL: INSERT INTO xcom (dag_run_id, task_id, map_index, key, dag_id, run_id,
value, timestamp) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)]
.....
(Background on this error at: https://sqlalche.me/e/14/9h9h); 17)
[2024-05-06, 15:31:02 KST] {local_task_job_runner.py:228} INFO - Task exited
with return code 1
[2024-05-06, 15:31:02 KST] {taskinstance.py:2776} INFO - 0 downstream tasks
scheduled from follow-on schedule check
### Related issues
https://github.com/apache/airflow/issues/39708
https://github.com/apache/airflow/discussions/39715
### Are you willing to submit a PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)