The GitHub Actions job "Tests" on airflow.git has succeeded.
Run started by GitHub user kaxil (triggered by kaxil).

Head commit for run:
92fc37bd7ad503f0197028cc6af2054c6c697137 / Kaxil Naik <[email protected]>
Migrate `TaskInstance` to UUID v7 primary key

Closes https://github.com/apache/airflow/issues/43161 and is part of 
[AIP-72](https://github.com/orgs/apache/projects/405).

As part of the ongoing work for [AIP-72: Task Execution 
Interface](https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-72+Task+Execution+Interface+aka+Task+SDK), 
we are migrating the `task_instance` table to use a UUID primary key. This 
change simplifies task instance identification, especially when 
communicating between the executor and workers.

Currently, the primary key of `task_instance` is a composite key consisting of 
`dag_id`, `task_id`, `run_id`, and `map_index`, as shown below. This migration 
introduces a **UUID v7** column (`id`) as the new primary key.

https://github.com/apache/airflow/blob/b4269f33c7151e6d61e07333003ec1e219285b07/airflow/models/taskinstance.py#L1815-L1819

The UUID v7 format was chosen because its leading bits encode a millisecond 
timestamp, so IDs sort by creation time. For existing records, the UUID v7 is 
generated from either `queued_dttm`, `start_date`, or the current timestamp.

<img width="792" alt="image" 
src="https://github.com/user-attachments/assets/ba68c9ae-4f9d-4cd2-8504-1b671d23ef6c">

(From [this blog 
post](https://www.toomanyafterthoughts.com/uuids-are-bad-for-database-index-performance-uuid7).)
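To illustrate how a UUID v7 can be derived from an existing timestamp, here is a minimal sketch using only the Python standard library. It is not the migration's actual code: the function names `uuid7_from_datetime` and `pick_timestamp` are hypothetical, and the fallback order (`queued_dttm`, then `start_date`, then now) is an assumption based on the order listed above.

```python
import os
import uuid
from datetime import datetime, timezone
from typing import Optional


def uuid7_from_datetime(ts: datetime) -> uuid.UUID:
    """Build a UUID v7 whose 48-bit prefix is `ts` in Unix milliseconds.

    Hypothetical helper for illustration. Pass a timezone-aware datetime;
    a naive one would be interpreted as local time by `timestamp()`.
    """
    unix_ms = int(ts.timestamp() * 1000) & ((1 << 48) - 1)
    rand_a = int.from_bytes(os.urandom(2), "big") & 0x0FFF  # 12 random bits
    rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)  # 62 random bits
    # Layout: 48-bit timestamp | 4-bit version (7) | 12 random bits |
    #         2-bit variant (0b10) | 62 random bits
    value = (unix_ms << 80) | (0x7 << 76) | (rand_a << 64) | (0b10 << 62) | rand_b
    return uuid.UUID(int=value)


def pick_timestamp(
    queued_dttm: Optional[datetime], start_date: Optional[datetime]
) -> datetime:
    """Assumed fallback order: queued_dttm, then start_date, then now."""
    return queued_dttm or start_date or datetime.now(timezone.utc)


early = uuid7_from_datetime(datetime(2024, 1, 1, tzinfo=timezone.utc))
late = uuid7_from_datetime(datetime(2024, 6, 1, tzinfo=timezone.utc))
print(early.version, early < late)
```

Because the timestamp occupies the most significant bits, comparing two such UUIDs orders them by the timestamp they were built from, which is what keeps index inserts roughly append-only.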

1. **Migrated Primary Key to UUID v7**
   - Replaced the composite primary key (`dag_id`, `task_id`, `run_id`, 
`map_index`) with a UUID v7 `id` field, ensuring temporal sorting and 
simplified task instance identification.

2. **Database-Specific UUID v7 Functions**
   - Added UUID v7 functions for each database:
      - **PostgreSQL**: Uses `pgcrypto` for generation with fallback.
      - **MySQL**: Custom deterministic UUID v7 function.
      - **SQLite**: Utilizes `uuid6` Python package.

3. **Updated Constraints and Indexes**
   - Added `UniqueConstraint` on (`dag_id`, `task_id`, `run_id`, `map_index`) 
for compatibility.
   - Modified foreign key constraints for the new primary key, handling 
downgrades to restore previous constraints.

4. **Model and API Adjustments**
   - Updated the `TaskInstance` model to use UUID v7 as the primary key via the 
[`uuid6`](https://pypi.org/project/uuid6/) library, which also provides `uuid7`. 😄
   - Adjusted REST API, views, and queries to support UUID-based lookups.
   - Modified tests for compatibility with the new primary key.
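The combined effect of the new `id` primary key and the compatibility `UniqueConstraint` can be sketched with an in-memory SQLite table. This is a simplified illustration, not the real Airflow schema: the column types, the `add_ti` helper, and the sample values are hypothetical, and `uuid.uuid4()` stands in for the UUID v7 generator only to keep the example dependency-free.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE task_instance (
        id TEXT PRIMARY KEY,              -- new single-column primary key
        dag_id TEXT NOT NULL,
        task_id TEXT NOT NULL,
        run_id TEXT NOT NULL,
        map_index INTEGER NOT NULL DEFAULT -1,
        -- the old composite key survives as a uniqueness rule
        UNIQUE (dag_id, task_id, run_id, map_index)
    )
    """
)


def add_ti(dag_id: str, task_id: str, run_id: str, map_index: int = -1) -> str:
    """Insert a row and return its id (hypothetical helper)."""
    ti_id = str(uuid.uuid4())  # stand-in: the actual change uses UUID v7
    conn.execute(
        "INSERT INTO task_instance (id, dag_id, task_id, run_id, map_index) "
        "VALUES (?, ?, ?, ?, ?)",
        (ti_id, dag_id, task_id, run_id, map_index),
    )
    return ti_id


ti_id = add_ti("example_dag", "extract", "manual__2024-10-27")

# A single value now identifies the task instance.
row = conn.execute(
    "SELECT dag_id, task_id FROM task_instance WHERE id = ?", (ti_id,)
).fetchone()

# The same (dag_id, task_id, run_id, map_index) tuple is still rejected.
try:
    add_ti("example_dag", "extract", "manual__2024-10-27")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(row, duplicate_rejected)
```

Keeping the unique constraint means code that still looks rows up by the four-column tuple continues to get at most one match, while the executor and workers can pass around a single `id`.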

Report URL: https://github.com/apache/airflow/actions/runs/11545154181

With regards,
GitHub Actions via GitBox


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
