Nataneljpwd commented on PR #64109:
URL: https://github.com/apache/airflow/pull/64109#issuecomment-4112414986

   > I'm struggling to understand how the problem has been solved from reading 
the code. Can you explain your solution for preventing the starvation?
   
   Sure,
   Instead of querying the first N runs and only then filtering on max active 
runs (the current behavior), we query the first N runs while checking max 
active runs in SQL, before the limit is applied.
   That way we skip the many runs which cannot be scheduled.
   
   Assume dags a, b
   a - 3 max active runs
   b - no limit (default to 16 from config)
   Say the max dagruns to schedule per loop (the limit) is 5 and the query 
result looks like this, where each row represents a run, a capital letter is a 
run that is schedulable according to max active runs, a small letter is a run 
that is not, and the - marks the limit (all runs before the - are selected, all 
others are ignored):
   
   A
   A
   A
   a
   a
   -
   B
   B
   B
   
   Here (as of now) the last 3 dagruns are omitted and ignored (starving the 
runs from b), while only 3 of the 5 selected runs of a can actually be 
scheduled.
   
   After the change it will look like so:
   
   A
   A
   A
   B
   B
   -
   B
   
   Now everything we queried is scheduled, without the dagruns from a limiting 
us (the effective limit becomes the max dagruns to schedule per loop 
configuration), and the queried runs are guaranteed to be able to run.
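
   To make the difference concrete, here is a minimal, self-contained sketch 
using sqlite3. It is not the query from the PR and does not use Airflow's 
models: the `dag` and `dag_run` tables, their columns, and the functions 
`old_selection` and `new_selection` are all hypothetical and exist only to 
reproduce the a/b example above with a limit of 5.

```python
import sqlite3

# Hypothetical, simplified schema (NOT Airflow's real tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dag (dag_id TEXT PRIMARY KEY, max_active_runs INTEGER);
    CREATE TABLE dag_run (id INTEGER PRIMARY KEY, dag_id TEXT, state TEXT);
    INSERT INTO dag VALUES ('a', 3), ('b', 16);
""")
# Five queued runs of "a" sort ahead of three queued runs of "b".
conn.executemany(
    "INSERT INTO dag_run (dag_id, state) VALUES (?, 'queued')",
    [("a",)] * 5 + [("b",)] * 3,
)


def old_selection(conn, limit):
    """Apply the limit in SQL first, then filter on max active runs in Python."""
    rows = conn.execute(
        """
        SELECT r.dag_id, d.max_active_runs
        FROM dag_run r JOIN dag d ON d.dag_id = r.dag_id
        WHERE r.state = 'queued'
        ORDER BY r.id
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    # Start from the runs that are already running for each dag.
    active = dict(
        conn.execute(
            "SELECT dag_id, COUNT(*) FROM dag_run "
            "WHERE state = 'running' GROUP BY dag_id"
        )
    )
    picked = []
    for dag_id, max_active in rows:
        if active.get(dag_id, 0) < max_active:
            active[dag_id] = active.get(dag_id, 0) + 1
            picked.append(dag_id)
    return picked


def new_selection(conn, limit):
    """Check max active runs inside the query, before the limit is applied."""
    rows = conn.execute(
        """
        SELECT r.dag_id
        FROM dag_run r JOIN dag d ON d.dag_id = r.dag_id
        WHERE r.state = 'queued'
          -- running runs plus queued runs up to and including this one
          AND (
              SELECT COUNT(*) FROM dag_run x
              WHERE x.dag_id = r.dag_id
                AND (x.state = 'running'
                     OR (x.state = 'queued' AND x.id <= r.id))
          ) <= d.max_active_runs
        ORDER BY r.id
        LIMIT ?
        """,
        (limit,),
    )
    return [row[0] for row in rows]


print(old_selection(conn, 5))  # ['a', 'a', 'a']
print(new_selection(conn, 5))  # ['a', 'a', 'a', 'b', 'b']
```

   With the limit applied first, all 5 slots go to runs of a and only 3 of them 
survive the filter. With the check inside the query, the 2 unschedulable runs 
of a never occupy a slot, so 2 runs of b are picked up in the same loop.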
   
   Hope this explains it. If anything is not clear, feel free to let me know 
and I will write a better explanation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

