tanelk commented on PR #21901:
URL: https://github.com/apache/airflow/pull/21901#issuecomment-1100813499

   I have some questions about this PR. Mainly how does it help with the 
"original" issue - deadlocking?
   
   These changes you propose might bring amazing performace improvements (hard 
to say without benchmarks) but it "feels" like a wrong way to solve a deadlock 
issue.  
   
   I'm now just spitballing without actually running the code.  The pair of 
methods `self._should_update_dag_next_dagruns` and 
`dag_model.calculate_dagrun_date_fields` are called in 3 places:
   
   Once when creating dag runs:
   
https://github.com/apache/airflow/blob/3a2eb961ca628d962ed5fad68760ef3439b2f40b/airflow/jobs/scheduler_job.py#L1038-L1039
   
   Twice when scheduling those dag runs:
   
https://github.com/apache/airflow/blob/3a2eb961ca628d962ed5fad68760ef3439b2f40b/airflow/jobs/scheduler_job.py#L1139-L1140
   
https://github.com/apache/airflow/blob/3a2eb961ca628d962ed5fad68760ef3439b2f40b/airflow/jobs/scheduler_job.py#L1165-L1166
   
   In the first case we have a lock on the `DagModel` from the 
`DagModel.dags_needing_dagruns` method. In the second two cases we have a lock 
on the `DagRun` from the `DagRun.next_dagruns_to_examine` method.
   
   Perhaps we should be locking also on the `DagModel`, because we are modifing 
it? It is probably best to bounce it by @ashb.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to