Lee2532 opened a new issue, #39784:
URL: https://github.com/apache/airflow/issues/39784

   ### Description
   
   When MySQL is the metadata database and thousands of S3 data files are 
transferred to GCS, the files are moved successfully, but the task then fails 
when it records their paths in the metadata database (XCom): the serialized 
list of paths exceeds the MySQL BLOB length limit. 
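
To give a sense of scale, here is a minimal sketch under stated assumptions. The paths are made up (the real object keys are not shown in this report), and JSON is assumed as the serialization format. A MySQL `BLOB` column holds at most 65,535 bytes, so a list of about 1330 moderately long paths can exceed it.

```python
import json

# A MySQL BLOB column holds at most 2**16 - 1 bytes.
MYSQL_BLOB_MAX_BYTES = 2**16 - 1

# Hypothetical file paths; the actual S3 keys are not part of the report.
paths = [
    f"some/prefix/year=2024/month=05/day=06/part-{i:05d}-data-file.parquet"
    for i in range(1330)
]

# Assumption: the list is serialized as JSON before being stored in XCom.
payload = json.dumps(paths).encode("utf-8")

exceeds_limit = len(payload) > MYSQL_BLOB_MAX_BYTES
print(len(payload), exceeds_limit)
```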
   
   To resolve this issue, it would be helpful to have an option to compress 
this list when the XCom data is not otherwise used.
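
As a rough sketch of what such compression might look like (an illustration only, not an existing Airflow API): the helper names `compress_paths` and `decompress_paths` are hypothetical, and JSON plus `zlib` are assumed as the serialization and compression formats.

```python
import json
import zlib


def compress_paths(paths: list[str]) -> bytes:
    """Serialize a list of paths to JSON and compress it with zlib."""
    return zlib.compress(json.dumps(paths).encode("utf-8"))


def decompress_paths(blob: bytes) -> list[str]:
    """Reverse compress_paths: decompress, then parse the JSON list."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

Path lists with long shared prefixes tend to compress well, but this only postpones the problem: a large enough list would still exceed the column limit after compression.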
   
   ### Use case/motivation
   
   [2024-05-06, 15:31:02 KST] {s3_to_gcs.py:262} INFO - All done, uploaded 1330 
files to Google Cloud Storage
   [2024-05-06, 15:31:02 KST] {taskinstance.py:1935} ERROR - Task failed with 
exception
   Traceback (most recent call last):
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", 
line 1910, in _execute_context
   self.dialect.do_execute(
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/default.py",
 line 736, in do_execute
   cursor.execute(statement, parameters)
   File "/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/cursors.py", 
line 179, in execute
   res = self._query(mogrified_query)
   File "/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/cursors.py", 
line 330, in _query
   db.query(q)
   File 
"/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/connections.py", line 
255, in query
   _mysql.connection.query(self, query)
   MySQLdb.DataError: (1406, "Data too long for column 'value' at row 1")
   The above exception was the direct cause of the following exception:
   Traceback (most recent call last):
   File 
"/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/session.py", 
line 74, in wrapper
   return func(*args, **kwargs)
   File 
"/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py",
 line 2477, in xcom_push
   XCom.set(
   File 
"/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/session.py", 
line 74, in wrapper
   return func(*args, **kwargs)
   File 
"/home/airflow/.local/lib/python3.9/site-packages/airflow/models/xcom.py", line 
273, in set
   session.flush()
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", 
line 3449, in flush
   self._flush(objects)
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", 
line 3589, in _flush
   transaction.rollback(_capture_exception=True)
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py",
 line 70, in __exit__
   compat.raise_(
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", 
line 211, in raise_
   raise exception
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", 
line 3549, in _flush
   flush_context.execute()
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py",
 line 456, in execute
   rec.execute(self)
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py",
 line 630, in execute
   util.preloaded.orm_persistence.save_obj(
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/persistence.py",
 line 245, in save_obj
   _emit_insert_statements(
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/persistence.py",
 line 1097, in _emit_insert_statements
   c = connection._execute_20(
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", 
line 1710, in _execute_20
   return meth(self, args_10style, kwargs_10style, execution_options)
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", 
line 334, in _execute_on_connection
   return connection._execute_clauseelement(
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", 
line 1577, in _execute_clauseelement
   ret = self._execute_context(
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", 
line 1953, in _execute_context
   self._handle_dbapi_exception(
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", 
line 2134, in _handle_dbapi_exception
   util.raise_(
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", 
line 211, in raise_
   raise exception
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", 
line 1910, in _execute_context
   self.dialect.do_execute(
   File 
"/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/default.py",
 line 736, in do_execute
   cursor.execute(statement, parameters)
   File "/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/cursors.py", 
line 179, in execute
   res = self._query(mogrified_query)
   File "/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/cursors.py", 
line 330, in _query
   db.query(q)
   File 
"/home/airflow/.local/lib/python3.9/site-packages/MySQLdb/connections.py", line 
255, in query
   _mysql.connection.query(self, query)
   sqlalchemy.exc.DataError: (MySQLdb.DataError) (1406, "Data too long for 
column 'value' at row 1")
   [SQL: INSERT INTO xcom (dag_run_id, task_id, map_index, key, dag_id, run_id, 
value, timestamp) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)]
   
   .....
   
   (Background on this error at: https://sqlalche.me/e/14/9h9h); 17)
   [2024-05-06, 15:31:02 KST] {local_task_job_runner.py:228} INFO - Task exited 
with return code 1
   [2024-05-06, 15:31:02 KST] {taskinstance.py:2776} INFO - 0 downstream tasks 
scheduled from follow-on schedule check
   
   ### Related issues
   
   https://github.com/apache/airflow/issues/39708
   https://github.com/apache/airflow/discussions/39715
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

