dstaple commented on issue #23361:
URL: https://github.com/apache/airflow/issues/23361#issuecomment-1183499266

   Regarding how the deadlock happens, as we know, we have two statements: 
UPDATE and SELECT ... FOR UPDATE that cause a deadlock. As described 
previously, the UPDATE statement is here 
https://github.com/apache/airflow/blob/2.2.5/airflow/models/dagrun.py#L903-L909.
 It needs to get a shared lock on matching rows of `task_instance` and 
`dag_run` tables in order to execute the update. However, it fails to do that. 
Before executing the `schedule_tis` function, the scheduler job already gets 
the lock in the `dag_run` table:
   
https://github.com/apache/airflow/blob/2.2.5/airflow/jobs/scheduler_job.py#L884
   https://github.com/apache/airflow/blob/2.2.5/airflow/models/dagrun.py#L287
   So it means the UPDATE statement is failing at acquiring the matching row 
lock of the `task_instance` table.
   
   In the meantime, the SELECT ... FOR UPDATE statement `select ... from 
task_instance join dag_run ... for update` also needs a lock of the matching 
rows in both task_instance and dag_run tables. It first acquires such a lock of 
the `task_instance` table, but then attempts to get the lock of the `dag_run` 
table, which was already acquired by the UPDATE statement. Therefore, a 
deadlock happens.
   
   Regarding how this might be fixed, the suspicion is 
`TaskInstance.refresh_from_db()` only needs to lock the rows in the 
task_instance table, not the dag_run table. This implies the deadlocks might be 
worked around by replacing `with_for_update()` with 
`with_for_update(of=TaskInstance)` here 
https://github.com/apache/airflow/blob/2.2.5/airflow/models/taskinstance.py#L736,
 at least for PostgreSQL and Oracle.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to