capytan opened a new pull request, #66762:
URL: https://github.com/apache/airflow/pull/66762

   related: #66734
   
   After upgrading to 3.2.1 we hit periodic Postgres deadlocks on 
`task_instance` between two writers that lock the same rows in different orders:
   
   - scheduler's `_verify_integrity_if_dag_changed` issues `UPDATE 
task_instance ... WHERE dag_id=? AND run_id=? AND state IN (...)`, which the 
planner satisfies via the `(dag_id, run_id)` index
   - api-server's `ti_update_state` does `SELECT FOR UPDATE` on the same rows 
by primary key
   
   When the row sets overlap, PG aborts one of them with `deadlock detected`. 
Traceback and pg_stat_activity output are in #66734.
   
   This PR replaces the bulk UPDATE with a `SELECT id ORDER BY id LIMIT N FOR 
UPDATE SKIP LOCKED` + `UPDATE WHERE id IN (...)` batch loop, so the scheduler 
also locks in PK order. Same approach #65920 takes for 
`check_trigger_timeouts`; #65818 tracks this family of bugs.
   
   One thing that bit me on the way: the inner UPDATE only changes 
`dag_version_id`, not `state`, so a SELECT filtered solely on `state IN 
(State.unfinished)` keeps returning the same N ids on every iteration and the 
loop never terminates once the batch is full. The candidate SELECT also filters 
on `dag_version_id IS NULL OR != latest_dag_version.id` so each batch advances. 
The previous single bulk UPDATE never surfaced this; the batched form needs it.
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [X] Yes — Claude Opus 4.7 (1M context)
   
   Generated-by: Claude Opus 4.7 (1M context) following [the 
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to