The GitHub Actions job "Tests" on airflow.git has succeeded. Run started by GitHub user potiuk (triggered by potiuk).
Head commit for run:
12df02bf4be0f4424072d253224efa3ffdab2c8f / Jarek Potiuk <[email protected]>
Fix Deadlock on refresh from DB by local task run

This PR attempts to fix the deadlock that occurs when a task instance is run in parallel with the _do_scheduling operation executing get_next_dagruns_to_examine.

The whole scheduling is based on locking the DagRuns the scheduler operates on, which means that the state of ANY task instance for that DagRun should not change during scheduling.

However, there are some cases where a task instance is locked FOR UPDATE without prior locking of the DagRun table. This happens, for example, when the local task job executes the task and runs the "check_and_change_state_before_execution" method on the task instance it runs. No earlier DagRun locking happens, and "refresh_from_db" run with lock_for_update takes the lock on both the TaskInstance row and the DagRun row.

The problem is that the locking happens in reverse sequence in this case:

1) get_next_dagruns_to_examine locks DagRun first and THEN tries to lock some task instances for that DagRun

2) "check_and_change_state_before_execution" effectively runs the query:

   select ... from task_instance join dag_run ... for update

   which FIRST locks TaskInstance and then the DagRun table.

This reverse sequence of locking is what causes the deadlock.

The fix is to force locking the DagRun before running the task instance query that joins dag_run to task_instance.

Fixes: #23361

Report URL: https://github.com/apache/airflow/actions/runs/2731610635

With regards,
GitHub Actions via GitBox
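
The lock-ordering idea behind the fix can be sketched in isolation. The following is only an analogy using in-process locks, not Airflow code and not the actual change in the commit: the names scheduler_loop, local_task_job and run_both are hypothetical, and two threading.Lock objects stand in for the DagRun and TaskInstance row locks. Both workers take the "DagRun" lock first and the "TaskInstance" lock second, so neither can hold one lock while waiting on the other in reverse order.

```python
import threading


def scheduler_loop(dag_run_lock, task_instance_lock, events, iterations):
    """Stand-in for the scheduler path: DagRun first, then TaskInstance."""
    for _ in range(iterations):
        with dag_run_lock:
            with task_instance_lock:
                events.append("scheduler")


def local_task_job(dag_run_lock, task_instance_lock, events, iterations):
    """Stand-in for the task path after the fix: same order as the scheduler.

    Before the fix, the analogous order would have been TaskInstance first
    and DagRun second, which is the reverse sequence that can deadlock.
    """
    for _ in range(iterations):
        with dag_run_lock:
            with task_instance_lock:
                events.append("task")


def run_both(iterations=1000, timeout=10.0):
    """Run both workers concurrently; return (finished, events)."""
    dag_run_lock = threading.Lock()        # hypothetical stand-in for a dag_run row lock
    task_instance_lock = threading.Lock()  # hypothetical stand-in for a task_instance row lock
    events = []
    args = (dag_run_lock, task_instance_lock, events, iterations)
    threads = [
        threading.Thread(target=scheduler_loop, args=args, daemon=True),
        threading.Thread(target=local_task_job, args=args, daemon=True),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join(timeout)
    # If any thread is still alive after the timeout, the workers are stuck.
    finished = all(not thread.is_alive() for thread in threads)
    return finished, events


if __name__ == "__main__":
    finished, events = run_both()
    print("finished:", finished, "events:", len(events))
```

Because every worker acquires the locks in the same order, a worker that holds the first lock can always obtain the second once it is released, so no circular wait can form. A database applies the same reasoning to row locks taken with SELECT ... FOR UPDATE.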
