ColtenOuO opened a new pull request, #67304:
URL: https://github.com/apache/airflow/pull/67304

   `BulkTaskInstanceService.handle_bulk_delete` re-queried the database once 
per task instance inside the delete loop, even though 
`_categorize_task_instances` had already loaded every matched task instance 
into `task_instances_map`.
   
   The "all map index" branch of the same method already batches its lookup
correctly; only the "specific `(task_id, map_index)`" branch had the N+1 query
pattern.
   
   ## Database round trips (N = task instances deleted)
   
   Each redundant `SELECT` is one extra DB round trip. The fix reuses the 
in-memory `task_instances_map`, so those N re-fetch round trips are removed:
   
   | N (task instances) | Round trips before | Round trips after |
   |--------------------|--------------------|-------------------|
   | 5                  | 15                 | 10                |
   | 10                 | 25                 | 15                |
   | 20                 | 45                 | 25                |
   
   - Before: `5 + 2N` queries, N of them redundant re-fetches.
   - After: `5 + N` queries, i.e. exactly **N round trips eliminated** per
     bulk delete.
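
As a quick check of the arithmetic above, the following sketch reproduces the table from the two formulas. The function name `round_trips` and its `requery_each` flag are hypothetical and exist only for this illustration; the constant 5 is the fixed overhead quoted in this description, not something measured here.

```python
def round_trips(n: int, *, requery_each: bool) -> int:
    """Query count for deleting n task instances, per the formulas in this PR.

    requery_each=True models the pre-fix code (5 + 2N),
    requery_each=False models the fixed code (5 + N).
    """
    fixed_overhead = 5  # queries that do not depend on n (figure taken from the PR text)
    per_instance = 2 if requery_each else 1
    return fixed_overhead + per_instance * n


for n in (5, 10, 20):
    before = round_trips(n, requery_each=True)
    after = round_trips(n, requery_each=False)
    # Columns: N, round trips before, round trips after, round trips saved
    print(n, before, after, before - after)
```

The last column always equals N, which is the number of redundant re-fetches removed.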
   
   ## Fix
   
   Capture the `task_instances_map` returned by `_categorize_task_instances` 
(previously discarded with `_`) and look each task instance up in it instead of 
issuing a fresh `SELECT` in the loop. Every key in `matched_task_keys` is a key 
of that map, so the lookup is guaranteed to hit.
   
   ## Validation plan
   
   1. Added `test_bulk_delete_does_not_requery_each_task_instance`: deletes two
      differently-sized batches (5 and 15) and asserts each extra task instance
      adds exactly one query; the removed re-query would roughly double the
      delta.
   2. Confirmed the test **fails on the pre-fix code** (query delta 20 instead
      of 10, i.e. doubled) and **passes with the fix**, so it genuinely guards
      against the regression.
   3. Ran the full `TestBulkTaskInstances` suite (27 tests); all pass, and bulk
      delete / update / wildcard / authorization behaviour is unchanged.
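
The delta-based assertion in step 1 can be illustrated with a self-contained sketch. This is not the test added in this PR: it uses the standard-library `sqlite3` module and a toy `ti` table instead of Airflow's test fixtures, and `make_db`, `count_queries`, `bulk_delete`, and `query_delta` are hypothetical helpers. Comparing two batch sizes cancels out the fixed per-request overhead, so the assertion depends only on the per-task-instance cost.

```python
import sqlite3


def make_db(n: int) -> sqlite3.Connection:
    """In-memory DB with n toy task-instance rows (autocommit, no implicit BEGIN)."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE ti (task_id TEXT, map_index INTEGER)")
    conn.executemany("INSERT INTO ti VALUES (?, ?)", [("t", i) for i in range(n)])
    return conn


def count_queries(conn: sqlite3.Connection, fn) -> int:
    """Run fn(conn) and count the SELECT/DELETE statements it executes."""
    seen: list[str] = []
    conn.set_trace_callback(seen.append)
    try:
        fn(conn)
    finally:
        conn.set_trace_callback(None)
    return sum(1 for s in seen if s.lstrip().upper().startswith(("SELECT", "DELETE")))


def bulk_delete(conn: sqlite3.Connection, *, requery: bool) -> None:
    # One batched load of all keys, analogous to building the map up front.
    keys = conn.execute("SELECT task_id, map_index FROM ti").fetchall()
    for task_id, map_index in keys:
        if requery:
            # Models the redundant per-row re-fetch of the pre-fix code.
            conn.execute(
                "SELECT task_id, map_index FROM ti WHERE task_id = ? AND map_index = ?",
                (task_id, map_index),
            ).fetchone()
        conn.execute(
            "DELETE FROM ti WHERE task_id = ? AND map_index = ?",
            (task_id, map_index),
        )


def query_delta(*, requery: bool) -> int:
    """Difference in query count between a batch of 15 and a batch of 5."""
    small = count_queries(make_db(5), lambda c: bulk_delete(c, requery=requery))
    large = count_queries(make_db(15), lambda c: bulk_delete(c, requery=requery))
    return large - small


# 10 extra rows should cost exactly 10 extra queries when nothing is re-fetched.
assert query_delta(requery=False) == 10
```

With `requery=True` the same delta doubles to 20, which mirrors the failure described in step 2.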
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [X] Yes — Claude Code (Opus 4.7)

