Asquator commented on issue #49508: URL: https://github.com/apache/airflow/issues/49508#issuecomment-2844711646
Looks convincing. According to the OP, the second DAG's runs are stuck at _queued_ and never get scheduled, so this shouldn't be related to `get_running_dag_runs_to_examine`, pools, or any issue with the critical section: the _queued -> running_ transition for DAG runs happens much earlier than any scheduling decision for tasks (such as creating TIs or scheduling them in the critical section).

The logic for transitioning DAG runs to _running_ lives in `SchedulerJobRunner._start_queued_dagruns` and `DagRun.get_queued_dag_runs_to_set_running`, and the described behaviour is a consequence of the optimistic scheduling strategy used across Airflow, which fails to handle a variety of edge cases. We first pull at most `DEFAULT_DAGRUNS_TO_EXAMINE` DAG runs whose DAGs are not yet maxed out:

https://github.com/apache/airflow/blob/e4957ff3827e0aea0465026023dd58288c5b1299/airflow-core/src/airflow/models/dagrun.py#L602C13-L611C14

Then we iterate over each of them, incrementing the per-DAG count of currently running DAG runs and dropping a run once no more can fit:

https://github.com/apache/airflow/blob/65be581ac7715d3765af4a3b99faea26e5da55fc/airflow-core/src/airflow/jobs/scheduler_job_runner.py#L1740C9-L1740C33
https://github.com/apache/airflow/blob/65be581ac7715d3765af4a3b99faea26e5da55fc/airflow-core/src/airflow/jobs/scheduler_job_runner.py#L1762C13-L1773C29

It's indeed the same as https://github.com/apache/airflow/issues/45636, but now with DAG runs instead of tasks. If the original issue can be solved efficiently with windowing, I guess this one can also benefit from it, like:

1. Window partitioned by DAG.
2. For each DAG, compute the number of running runs.
3. Admit new runs as long as there are free slots (per DAG).
4. Put a limit on the result (which can be far less restrictive, since we delegate more work to SQL and the cost of the query won't grow as fast as the throughput will).
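To make the idea concrete, here is a minimal sketch of steps 1-4 as a single windowed query. This is not Airflow's actual schema or code: the `dag` / `dag_run` tables, the `max_active_runs` column, and the states are simplified stand-ins, and I'm using stdlib `sqlite3` (window functions need SQLite ≥ 3.25) purely for illustration.

```python
import sqlite3

# Toy schema standing in for Airflow's dag / dag_run tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dag (dag_id TEXT PRIMARY KEY, max_active_runs INT);
CREATE TABLE dag_run (id INTEGER PRIMARY KEY, dag_id TEXT, state TEXT);
INSERT INTO dag VALUES ('a', 2), ('b', 2);
-- DAG 'a' is maxed out (2 running) with 3 queued runs; DAG 'b' has 2 queued.
INSERT INTO dag_run (dag_id, state) VALUES
  ('a','running'), ('a','running'),
  ('a','queued'), ('a','queued'), ('a','queued'),
  ('b','queued'), ('b','queued');
""")

# Rank queued runs per DAG (the window), then admit only those whose rank
# fits into the remaining per-DAG slots; a global LIMIT caps the batch,
# analogous to DEFAULT_DAGRUNS_TO_EXAMINE but applied after per-DAG capping.
rows = conn.execute("""
WITH running AS (
    SELECT dag_id, COUNT(*) AS n
    FROM dag_run WHERE state = 'running' GROUP BY dag_id
),
ranked AS (
    SELECT dr.id, dr.dag_id,
           ROW_NUMBER() OVER (PARTITION BY dr.dag_id ORDER BY dr.id) AS rn
    FROM dag_run dr WHERE dr.state = 'queued'
)
SELECT r.id, r.dag_id
FROM ranked r
JOIN dag d ON d.dag_id = r.dag_id
LEFT JOIN running ru ON ru.dag_id = r.dag_id
WHERE r.rn <= d.max_active_runs - COALESCE(ru.n, 0)
ORDER BY r.id
LIMIT 100
""").fetchall()

print(rows)  # DAG 'a' has no free slots, so only 'b' runs are admitted
```

The key difference from the optimistic loop is that a maxed-out DAG contributes zero rows to the batch instead of consuming `DEFAULT_DAGRUNS_TO_EXAMINE` slots, so other DAGs can't be starved by it.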
Since windowing may introduce additional overhead that doesn't always translate into higher throughput (assuming there aren't that many DAG runs in the system), I'm looking for a compromise between the optimistic approach and the windowing (pessimistic) one. It could be made configurable, though that would make the system progressively harder to tweak to everyone's needs, which is undesirable. I hope the windowing can be done in a way whose impact on query time is negligible, with the help of correct indexing, or by changing some core concurrency parameters, e.g. introducing `DEFAULT_DAGS_TO_EXAMINE` in place of `DEFAULT_DAGRUNS_TO_EXAMINE`, where the new parameter limits the number of windows in the query. Some research is needed to find out.
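A hypothetical `DEFAULT_DAGS_TO_EXAMINE` (limiting the number of windows rather than the number of rows) could be expressed with `DENSE_RANK` over the partition key. Again a toy sketch with a made-up schema, not Airflow code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dag_run (id INTEGER PRIMARY KEY, dag_id TEXT, state TEXT);
INSERT INTO dag_run (dag_id, state) VALUES
  ('a','queued'), ('b','queued'), ('b','queued'), ('c','queued');
""")

# Hypothetical DEFAULT_DAGS_TO_EXAMINE: cap the number of distinct DAGs
# (i.e. windows) considered per scheduler loop, not the number of runs.
DAGS_TO_EXAMINE = 2

rows = conn.execute("""
WITH ranked AS (
    SELECT id, dag_id,
           -- DENSE_RANK numbers distinct dag_ids 1, 2, 3, ...
           DENSE_RANK() OVER (ORDER BY dag_id) AS dag_rank
    FROM dag_run WHERE state = 'queued'
)
SELECT id, dag_id FROM ranked
WHERE dag_rank <= ?
ORDER BY id
""", (DAGS_TO_EXAMINE,)).fetchall()

print(rows)  # only DAGs 'a' and 'b' make the cut; 'c' waits for the next loop
```

With this shape the per-loop cost scales with the number of DAGs examined rather than the number of queued runs, which is what would let the row limit be relaxed.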
