dstandish commented on issue #49508:
URL: https://github.com/apache/airflow/issues/49508#issuecomment-2945364573
Yeah Collin, I think we need clarification on the repro scenario / under what
conditions starvation occurs.
Let me go through your post carefully.
> Two DAGs each receive a large batch of DAG Runs. The number of runs for
each DAG exceeds max_dagruns_per_loop_to_schedule.
What state are the "received" runs in? I assume you mean they were triggered
via the API?
> Each DAG run is very short, shorter than the heartrate of this Airflow
deployment.
What is the significance of the runs being very short? What do you think
that has to do with this?
> Both DAGs have a max_active_runs that is far less than dagruns_per_loop.
Why? So that we can be confident that the scheduler _should_ fetch some of
these runs in the query?
> So: max_active_runs < max_dagruns_per_loop_to_schedule < number of queued
DAG runs.
> Each scheduler loop, there are a very small number of DAG Run "slots" for
> the first DAG, so the check `coalesce(running_drs.c.num_running, text("0")) <
> coalesce(Backfill.max_active_runs, DagModel.max_active_runs)` does not apply.
> But then all the DAG runs that are considered are from the first DAG. So Second
> DAG effectively has to wait for nearly all of First DAG's runs to complete
> before any of its runs are moved from queued to running.
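For concreteness, here's a toy simulation of the starvation you're describing. This is not Airflow's actual scheduler code; the names and values (`MAX_ACTIVE_RUNS`, `MAX_DAGRUNS_PER_LOOP`, the two DAG ids) are hypothetical, and it only models the claim that the per-loop window can be filled entirely by one DAG's queued runs:

```python
from collections import deque

# Hypothetical values for illustration only; not Airflow config names.
MAX_ACTIVE_RUNS = 2
MAX_DAGRUNS_PER_LOOP = 20

# 100 queued runs per DAG, with all of dag_a's runs ahead of dag_b's.
queued = deque([("dag_a", i) for i in range(100)] +
               [("dag_b", i) for i in range(100)])
running = {"dag_a": 0, "dag_b": 0}

def scheduler_loop():
    """Look at up to MAX_DAGRUNS_PER_LOOP queued runs (in queue order) and
    start each one whose DAG is still under MAX_ACTIVE_RUNS."""
    started = []
    for dag_id, run_id in list(queued)[:MAX_DAGRUNS_PER_LOOP]:
        if running[dag_id] < MAX_ACTIVE_RUNS:
            running[dag_id] += 1
            queued.remove((dag_id, run_id))
            started.append((dag_id, run_id))
    return started

started = scheduler_loop()
# All 20 runs considered this loop belong to dag_a, so dag_b starts
# nothing even though it has free slots under its max_active_runs.
print(started)
```

If that toy model matches what the real query does, dag_b would indeed wait until dag_a's backlog shrinks below the per-loop window, which is the starvation claim we'd want to verify against the actual scheduler behavior.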
Could it be that what you were observing had to do with the last scheduling
decision not getting updated because tasks were backed up somehow?
I guess either way, this seems like a very rare edge case and, since two
people have tried and failed to reproduce it, I'm not sure it's worth
continuing to try without new information.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]