The GitHub Actions job "Tests AMD" on airflow.git/fix/mysql-uuid-migration-malformed-ids has failed. Run started by GitHub user kaxil (triggered by kaxil).
Head commit for run:
debe60afbed1c7f1dcde3ae01c0f7e9847bc509c / Kaxil Naik

Fix MySQL UUID generation in task_instance migration

The MySQL UUID v7 generation function in migration 0042 was creating malformed UUIDs that fail Pydantic validation when the scheduler attempts to enqueue task instances.

## Problem

The original MySQL function had two critical issues:

1. Generated only 16 random hex characters instead of the required 20
2. Used SUBSTRING(rand_hex, 9) without a length limit, producing 8-character final segments instead of the required 12 characters

This resulted in malformed UUIDs like:

- Bad: 0198cf6d-fb98-4555-7301-e29b8403 (32 chars, last segment: 8 chars)
- Good: 0198cf6d-fb98-4555-7301-e29b8403abcd (36 chars, last segment: 12 chars)

## When This Issue Occurs

The validation error happens when:

1. Task instances exist in 'scheduled' state before migrating from 2.10 to 3.0.x
2. These tasks receive malformed UUIDs during migration
3. Scheduler tries to enqueue these tasks via ExecuteTask.make()
4. Pydantic validation fails: 'invalid group length in group 4: expected 12, found 8'

Users who have no scheduled tasks during migration, or who create new DAG runs, typically don't encounter this issue, since new task instances get proper UUIDs from the Python uuid7() function.

## Solution

Updated the MySQL uuid_generate_v7 function to:

- Use RANDOM_BYTES(10) for cryptographically secure 20-character hex data
- Apply explicit SUBSTRING(rand_hex, 9, 12) to ensure a 12-character final segment
- Mark function as NOT DETERMINISTIC (correct for random functions)
- Use CHAR(20) declaration matching actual usage

## Why No Data Migration

We decided against creating a separate migration to fix existing malformed UUIDs because:

1. **Limited scope** - Only affects task instances in 'scheduled' state during migration
2. **Self-healing** - System recovers as old tasks complete and new ones are created
3. **Risk mitigation** - Avoid complex primary key modifications in production
4. **Alternative available** - Manual fix script provided below for affected users
5. **Prevention focus** - Fixing root cause prevents future occurrences

## Manual Fix for Affected Users

If you encounter the UUID validation error, you can fix existing malformed UUIDs:

```sql
-- Fix malformed UUIDs by extending them to the proper length.
-- A malformed id is 32 chars (8-4-4-4-8), so appending 4 random hex chars
-- yields the required 36 chars with a 12-char final segment.
UPDATE task_instance
SET id = CONCAT(
    id,                                            -- Keep the existing 32 chars
    LPAD(HEX(FLOOR(RAND() * POW(2, 16))), 4, '0')  -- Append 4 random hex chars
)
WHERE LENGTH(SUBSTRING_INDEX(id, '-', -1)) = 8;    -- Find 8-char final segments

-- Verify the fix (should return no rows)
SELECT id,
       LENGTH(id) AS uuid_length,
       LENGTH(SUBSTRING_INDEX(id, '-', -1)) AS last_segment_length
FROM task_instance
WHERE LENGTH(SUBSTRING_INDEX(id, '-', -1)) != 12
LIMIT 5;
```

## Testing

Verified the fix generates valid UUIDs:

- Format: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (36 chars total)
- Final segment: 12 characters (not 8)
- Passes standard UUID validation patterns

Fixes #54554

Report URL: https://github.com/apache/airflow/actions/runs/17144618336

With regards,
GitHub Actions via GitBox
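
As a rough illustration of the group-length check described in the commit message, the following Python sketch tests the two example IDs against the canonical 8-4-4-4-12 layout. The helper names `group_lengths` and `is_well_formed` are hypothetical and not part of Airflow or Pydantic; the sketch only checks group lengths and hex characters and does not reproduce Pydantic's actual validation.

```python
import re
from typing import Tuple

# Canonical textual UUID layout: five hex groups of 8, 4, 4, 4 and 12 characters.
EXPECTED_GROUP_LENGTHS = (8, 4, 4, 4, 12)
HEX_GROUP = re.compile(r"[0-9a-fA-F]+")


def group_lengths(value: str) -> Tuple[int, ...]:
    """Return the length of each dash-separated group (hypothetical helper)."""
    return tuple(len(group) for group in value.split("-"))


def is_well_formed(value: str) -> bool:
    """Check the 8-4-4-4-12 layout and that every group is hexadecimal."""
    groups = value.split("-")
    return group_lengths(value) == EXPECTED_GROUP_LENGTHS and all(
        HEX_GROUP.fullmatch(group) for group in groups
    )


# The two example IDs quoted in the commit message.
bad = "0198cf6d-fb98-4555-7301-e29b8403"
good = "0198cf6d-fb98-4555-7301-e29b8403abcd"

print(group_lengths(bad))    # (8, 4, 4, 4, 8)
print(is_well_formed(bad))   # False
print(is_well_formed(good))  # True
```

A check along these lines could be run against exported `task_instance` IDs before upgrading, to see whether any rows would trip the scheduler's validation.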
