Asquator commented on issue #49508:
URL: https://github.com/apache/airflow/issues/49508#issuecomment-2844711646

   Looks convincing. According to OP, the second DAG's runs are stuck at 
_queued_ and never get scheduled, so the problem shouldn't be related to 
`get_running_dag_runs_to_examine`, pools, or any issue in the critical 
section: the _queued -> running_ transition for DAG runs happens well before 
we reach any scheduling decisions for tasks (like creating TIs or scheduling 
them in the critical section). The logic for transitioning DAG runs to 
_running_ lives in `SchedulerJobRunner._start_queued_dagruns` and 
`DagRun.get_queued_dag_runs_to_set_running`, and the described behaviour is a 
consequence of the optimistic scheduling strategy used across Airflow, which 
fails to handle a variety of edge cases.
   
   We first pull at most `DEFAULT_DAGRUNS_TO_EXAMINE` DAG runs whose DAGs are 
not yet maxed out:
   
https://github.com/apache/airflow/blob/e4957ff3827e0aea0465026023dd58288c5b1299/airflow-core/src/airflow/models/dagrun.py#L602C13-L611C14
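   To see why this pre-selection is optimistic, here is a scaled-down sketch 
(all names and shapes are mine for illustration, not the actual Airflow query): 
the filter only checks counts as they stood at query time and never updates 
them while the batch is being filled.

```python
# Illustrative sketch, NOT the real Airflow code: the pre-selection filters
# queued runs against a *snapshot* of active-run counts, so one DAG's runs
# can fill the whole batch even when only one slot is free.
def get_queued_runs_to_examine(queued, active_counts, max_active_runs, limit):
    """Return up to `limit` queued (dag_id, run_id) pairs whose DAG is not
    yet at max_active_runs according to the snapshot counts."""
    batch = []
    for dag_id, run_id in queued:
        if active_counts.get(dag_id, 0) < max_active_runs.get(dag_id, 1):
            batch.append((dag_id, run_id))
            if len(batch) == limit:
                break
    return batch

queued = [("a", "r1"), ("a", "r2"), ("b", "r1")]
# With a small limit, DAG 'a' fills the entire batch and DAG 'b' is never
# even examined -- the symptom described by OP, scaled down.
batch = get_queued_runs_to_examine(queued, {}, {"a": 1, "b": 1}, limit=2)
```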
   
   Then we iterate over each of them, incrementing the DAG's active-run count 
as we go and dropping runs once their DAG cannot accept any more:
   
https://github.com/apache/airflow/blob/65be581ac7715d3765af4a3b99faea26e5da55fc/airflow-core/src/airflow/jobs/scheduler_job_runner.py#L1740C9-L1740C33
   
https://github.com/apache/airflow/blob/65be581ac7715d3765af4a3b99faea26e5da55fc/airflow-core/src/airflow/jobs/scheduler_job_runner.py#L1762C13-L1773C29
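   The second pass can be sketched like this (again with illustrative names, 
not the real implementation): only here are the counts incremented, so the 
extra runs admitted by the optimistic pre-selection simply burn batch slots.

```python
# Illustrative sketch, NOT the real Airflow code: walk the batch, start a
# run only while its DAG still has a free slot, updating the in-memory
# count as we go; maxed-out runs stay queued but still occupied a slot in
# the fixed-size batch.
def start_queued_dagruns(batch, active_counts, max_active_runs):
    started = []
    for dag_id, run_id in batch:
        count = active_counts.get(dag_id, 0)
        if count >= max_active_runs.get(dag_id, 1):
            continue  # DAG maxed out mid-batch: this run stays queued
        active_counts[dag_id] = count + 1
        started.append((dag_id, run_id))
    return started

# A batch dominated by one DAG: its surplus runs consume slots that could
# have gone to other DAGs' queued runs in a later, fuller batch.
batch = [("a", f"r{i}") for i in range(5)] + [("b", "r1")]
started = start_queued_dagruns(batch, {}, {"a": 1, "b": 1})
```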
   
   
   It's indeed the same as https://github.com/apache/airflow/issues/45636, but 
now with DAG runs instead of tasks. If the original issue can be efficiently 
solved with windowing, I guess this one can also benefit from it, for example:
   
   1. Window partitioned by DAG
   2. For each DAG compute the number of runs
   3. Stuff in new runs as long as there are free slots (per DAG)
   4. Put a limit on the result (which can be far less restrictive, since we 
delegate more work to SQL and the query cost won't grow as fast as the 
throughput gains).
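   
   A minimal sketch of steps 1-4, using an illustrative schema and plain 
sqlite3 purely for demonstration (Airflow's real tables and the SQLAlchemy 
query would of course look different): rank queued runs per DAG with a window 
function, keep only ranks that fit within that DAG's free slots, then apply a 
loose outer limit.

```python
import sqlite3

# Hypothetical mini-schema, NOT Airflow's real tables: a dag_run table of
# queued runs and a dag table tracking per-DAG concurrency limits.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, state TEXT);
CREATE TABLE dag (dag_id TEXT PRIMARY KEY, max_active_runs INT, active_runs INT);
INSERT INTO dag VALUES ('a', 1, 0), ('b', 1, 0);
INSERT INTO dag_run VALUES
  ('a', 'a1', 'queued'), ('a', 'a2', 'queued'), ('a', 'a3', 'queued'),
  ('b', 'b1', 'queued');
""")

# Window partitioned by DAG (step 1), per-DAG rank vs. free slots (steps
# 2-3), and a far less restrictive outer limit (step 4): DAG 'a' can no
# longer crowd DAG 'b' out of the batch.
rows = conn.execute("""
SELECT dag_id, run_id FROM (
  SELECT r.dag_id, r.run_id,
         ROW_NUMBER() OVER (PARTITION BY r.dag_id ORDER BY r.run_id) AS rn,
         d.max_active_runs - d.active_runs AS free_slots
  FROM dag_run r JOIN dag d ON d.dag_id = r.dag_id
  WHERE r.state = 'queued'
)
WHERE rn <= free_slots
LIMIT 100
""").fetchall()
```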
   
   Since windowing may introduce overhead that doesn't always pay off in 
higher throughput (e.g. when there aren't many DAG runs in the system), I'm 
looking for a compromise between the optimistic and the windowing 
(pessimistic) approaches. Making it configurable is possible, though that 
makes the system ever harder to tweak to everyone's needs, which is not 
desirable. I hope the windowing can be done in a way whose impact on query 
time is negligible, with the help of correct indexing, or by changing some 
core concurrency parameters, e.g. introducing `DEFAULT_DAGS_TO_EXAMINE` in 
place of `DEFAULT_DAGRUNS_TO_EXAMINE`, where the new parameter limits the 
number of windows in the query. Some research is needed to find out.

