jason810496 opened a new pull request, #63920: URL: https://github.com/apache/airflow/pull/63920
* closes: https://github.com/apache/airflow/issues/63549

## Why

As mentioned in the issue:

> Migration 0101_3_2_0_ui_improvements_for_deadlines upgrade is slow on deployments with large deadline and serialized_dag tables. With 10M deadline rows and 100K serialized_dags, the migration took ~16 minutes.

## What

There are several critical paths to improve:

1. Previously we looped over `serialized_dag.dag_id` (**a column without an index**); now we iterate over `serialized_dag.id` as the batch-processing partition key instead.
2. Add a temporary index on `serialized_dag.dagrun_id`, used only during the migration, because we can't avoid iterating over that column.
3. Avoid ser/deser compression for `dag_data`.
4. Build a dialect-specific filter to skip rows without deadline data at the SQL level.
5. Fetch dagrun IDs once per DAG to avoid repeating the expensive dag_run/serialized_dag JOIN for every alert.

---

##### Was generative AI tooling used to co-author this PR?

- [x] Yes (please specify the tool below)

Claude Code for discussing and drafting
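
For readers unfamiliar with the pattern in item 1 of the What list, here is a minimal sketch of batching on a primary key instead of an unindexed column. It is not the migration's actual code: the toy `serialized_dag` schema, the integer `id`, and the helper names `make_demo_db` and `iter_id_batches` are all assumptions made for illustration, and it uses the standard-library `sqlite3` module rather than Alembic or SQLAlchemy.

```python
import sqlite3


def make_demo_db(n_rows):
    """Build a toy in-memory table (hypothetical schema, not Airflow's)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE serialized_dag (id INTEGER PRIMARY KEY, dag_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO serialized_dag (id, dag_id) VALUES (?, ?)",
        [(i, f"dag_{i}") for i in range(1, n_rows + 1)],
    )
    return conn


def iter_id_batches(conn, batch_size):
    """Yield batches of (id, dag_id) rows, paging on the primary key.

    Each query seeks past the last id already seen, so the database can
    use the primary-key index instead of rescanning an unindexed column.
    """
    last_id = 0  # assumes positive integer ids in this toy schema
    while True:
        rows = conn.execute(
            "SELECT id, dag_id FROM serialized_dag "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]  # resume after the last id in this batch


if __name__ == "__main__":
    demo = make_demo_db(10)
    for batch in iter_id_batches(demo, 4):
        print([row[0] for row in batch])
```

The same seek-by-key loop works for any orderable key type; only the initial `last_id` sentinel would need to change.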
