1fanwang opened a new pull request, #66820: URL: https://github.com/apache/airflow/pull/66820
### Problem

`SchedulerJobRunner._do_scheduling()` runs in two phases against the same session. Phase 1 calls `_start_queued_dagruns()` and `guard.commit()`; phase 2 fetches running dag runs and calls `_schedule_all_dag_runs()`.

After phase 1 commits, the `DagRun` objects that phase 1 loaded are still in the session's identity map. When phase 2's `_schedule_all_dag_runs()` triggers a flush or merge, those leftover instances can be re-dirtied and end up in the final `guard.commit()`.

Under HA scheduler deployments with several active replicas processing different dag runs, that means each replica's final commit touches not only the rows it intends to update but also a tail of stale rows, in an order driven by phase 1, which is not the order other replicas are taking for their own work. The result is A-B / B-A deadlocks on the `(dag_run, task_instance)` lock pair (`1213 "Deadlock found when trying to get lock"` on MySQL, `deadlock detected` on PostgreSQL), and the loop slows down under contention.

### Fix

Add a single `session.expunge_all()` immediately after the phase 1 `guard.commit()` and before phase 2's `DagRun.get_running_dag_runs_to_examine(...)`. Phase 2 then reloads its working set fresh, and the final commit touches only the rows phase 2 intentionally pulled in. The outer `session.expunge_all()` already in place later in `_do_scheduling()` does the same thing globally; this one closes the gap between phases.

### Tests

Added a unit test that patches `_start_queued_dagruns` to seed the identity map with a known dag run, patches `DagRun.get_running_dag_runs_to_examine` to capture the identity map keys at the start of phase 2, and asserts the captured set is empty.

Closes #66817
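### Illustration (toy model)

For readers who want to see the mechanism in isolation, here is a minimal, standard-library-only sketch of the two-phase loop. It is not Airflow code and does not use SQLAlchemy: `ToySession`, `touch_all_tracked()`, `do_scheduling()`, and the `run_a` / `run_b` / `run_c` keys are all hypothetical names made up for this illustration. The model is deliberately pessimistic in assuming that a phase 2 flush re-dirties every object still tracked by the session; in the real scheduler this only *can* happen.

```python
# Toy model only: ToySession is a hypothetical stand-in for an ORM session.
# It is NOT SQLAlchemy's Session and NOT Airflow's scheduler code.


class ToySession:
    def __init__(self):
        self.identity_map = {}  # key -> loaded object
        self.dirty = set()      # keys with pending changes
        self.committed = []     # keys written by each commit, in order

    def load(self, key):
        # Return the tracked object for key, creating it on first load.
        return self.identity_map.setdefault(key, {"key": key})

    def touch_all_tracked(self):
        # Pessimistic assumption: a flush/merge side effect marks every
        # object still in the identity map as dirty again.
        self.dirty.update(self.identity_map)

    def commit(self):
        self.committed.append(sorted(self.dirty))
        self.dirty.clear()

    def expunge_all(self):
        # Forget every tracked object, so nothing is carried forward.
        self.identity_map.clear()
        self.dirty.clear()


def do_scheduling(session, phase1_keys, phase2_keys, expunge_between_phases):
    # Phase 1: load and modify some dag runs, then commit.
    for key in phase1_keys:
        session.load(key)["state"] = "running"
        session.dirty.add(key)
    session.commit()

    if expunge_between_phases:
        session.expunge_all()

    # What the unit test in this PR captures: identity map keys at the
    # start of phase 2.
    keys_at_phase2_start = set(session.identity_map)

    # Phase 2: load the intended working set, then commit.
    for key in phase2_keys:
        session.load(key)["examined"] = True
        session.dirty.add(key)
    session.touch_all_tracked()
    session.commit()

    return keys_at_phase2_start, session.committed[-1]


leftover, final_rows = do_scheduling(
    ToySession(), ["run_a", "run_b"], ["run_c"], expunge_between_phases=False
)
print(leftover, final_rows)

leftover, final_rows = do_scheduling(
    ToySession(), ["run_a", "run_b"], ["run_c"], expunge_between_phases=True
)
print(leftover, final_rows)
```

In this model, the run without the expunge carries `run_a` and `run_b` into the final commit alongside `run_c`, while the run with the expunge commits only `run_c`. That is the same property the unit test checks by asserting the identity map is empty at the start of phase 2.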
