kaxil commented on issue #56647:
URL: https://github.com/apache/airflow/issues/56647#issuecomment-3409103302
This is a database migration issue caused by invalid UTF-8 data in your XCom
table. The migration attempts to convert `BYTEA` to `JSONB` but fails when
encountering corrupted Unicode sequences.
## Quick Fix
Run this query **before** migration to archive and remove problematic
records:
```sql
BEGIN;
-- Create archive table for corrupted data
CREATE TABLE IF NOT EXISTS _xcom_corrupted_archive (
dag_run_id INTEGER NOT NULL,
task_id VARCHAR(250) NOT NULL,
map_index INTEGER NOT NULL,
key VARCHAR(512) NOT NULL,
dag_id VARCHAR(250) NOT NULL,
run_id VARCHAR(250) NOT NULL,
value BYTEA,
timestamp TIMESTAMP NOT NULL,
archived_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
error_reason TEXT DEFAULT 'Invalid UTF-8 encoding',
PRIMARY KEY (dag_run_id, task_id, map_index, key)
);
-- Archive corrupted records (can't be converted to valid JSON anyway)
INSERT INTO _xcom_corrupted_archive (dag_run_id, task_id, map_index, key,
dag_id, run_id, value, timestamp)
SELECT dag_run_id, task_id, map_index, key, dag_id, run_id, value, timestamp
FROM xcom
WHERE value IS NOT NULL
AND get_byte(value, 0) != 128 -- Not pickled data
AND NOT (
-- Test UTF-8 validity
convert_from(value, 'UTF8') IS NOT NULL
)
ON CONFLICT DO NOTHING;
-- Delete corrupted records
DELETE FROM xcom
WHERE (dag_run_id, task_id, map_index, key) IN (
SELECT dag_run_id, task_id, map_index, key FROM _xcom_corrupted_archive
);
-- Show what was archived
SELECT COUNT(*) as archived_count FROM _xcom_corrupted_archive;
COMMIT;
```
After running this, retry `airflow db migrate`. The corrupted data is safely
archived in `_xcom_corrupted_archive` if you need to investigate later.
**Note:** Backup your database before running this!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]