ephraimbuddy edited a comment on issue #18023:
URL: https://github.com/apache/airflow/issues/18023#issuecomment-912822201
> This can potentially impact the time to move a DagRun from queued to running, as the `_start_queued_dagruns` function has to iterate through the entire set of queued DagRuns (looking at `max_dagruns_per_loop_to_schedule` at a time) before it re-examines a given DagRun.
It doesn't go through the entire set of `dagruns`; it goes through a subset whose size is controlled by `max_dagruns_per_loop_to_schedule`, and the `dagruns` returned are ordered by `last_scheduling_decision`. Or are you actually seeing a performance impact?
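For context, here is a rough, hypothetical sketch (not the actual scheduler source) of what that lookup amounts to: a limited, ordered query over queued runs rather than a scan of everything. The function name, session handling, and the exact ordering of NULL `last_scheduling_decision` values are assumptions on my part.

```python
# Hypothetical sketch of the per-loop selection of queued DagRuns.
# Not the real scheduler code; it only illustrates the limit + ordering.
from airflow.models import DagRun
from airflow.utils.state import State


def queued_dagruns_to_examine(session, limit):
    """Return up to `limit` queued DagRuns, oldest scheduling decision first.

    `limit` would correspond to [scheduler] max_dagruns_per_loop_to_schedule.
    """
    return (
        session.query(DagRun)
        .filter(DagRun.state == State.QUEUED)
        .order_by(DagRun.last_scheduling_decision)
        .limit(limit)
        .all()
    )
```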
> In 2.1.2, the scheduler would only create `max_active_runs` DAGRuns for a single DAG, and then create additional runs as older runs were completed.
In 2.1.2, the scheduler had to create a dagrun at the point it wanted to start running it, while in 2.1.3 the dagrun already exists in the `queued` state and the scheduler only updates its state when it wants to start running it.
`max_active_runs` applies to dagruns in the `running` state.
By default, at each scheduler loop, 20 `dagruns` are created in the `queued` state and 16 are updated to the `running` state if `max_active_runs` has not been reached. If `max_active_runs` has been reached, only the `last_scheduling_decision` is updated for the collected dagruns.
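If it helps, these are the knobs involved, written out in `airflow.cfg` form. The values shown are only illustrative and match the numbers mentioned above; check the defaults shipped with your version.

```ini
[scheduler]
# How many new DagRuns (in the queued state) may be created per scheduler loop
max_dagruns_to_create_per_loop = 20
# How many queued DagRuns are examined for promotion to running per loop
max_dagruns_per_loop_to_schedule = 20

[core]
# Default max_active_runs for DAGs that don't set it explicitly
max_active_runs_per_dag = 16
```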
Even if you have many `dagruns`, creating 20 at each loop won't have any noticeable impact on the DB, if you ask me. At some point no further dagruns are created, because all the needed `dagruns` already exist in the `queued` state.
From then on, the scheduler only updates the state of up to 16 `dagruns` per loop when necessary, and updates the `last_scheduling_decision` for the `dagruns` that were not moved to the `running` state.
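Putting it together, here is a minimal, self-contained sketch of one loop as described above. It is plain Python with made-up names, it tracks a single global pool of runs (the real scheduler does this bookkeeping per DAG), and it is not the actual implementation.

```python
import datetime

# Illustrative caps, matching the numbers discussed above.
MAX_DAGRUNS_TO_CREATE_PER_LOOP = 20
MAX_DAGRUNS_PER_LOOP_TO_SCHEDULE = 20


def scheduler_loop(needed_runs, queued_runs, running_runs, max_active_runs):
    """One pass of the (simplified) loop: create queued runs, then promote some."""
    now = datetime.datetime.utcnow()

    # 1. Create missing DagRuns in the "queued" state, up to the per-loop cap.
    for run in needed_runs[:MAX_DAGRUNS_TO_CREATE_PER_LOOP]:
        run["state"] = "queued"
        run["last_scheduling_decision"] = None
        queued_runs.append(run)

    # 2. Examine a limited, ordered slice of the queued runs:
    #    never-examined runs first, then oldest scheduling decision first.
    queued_runs.sort(
        key=lambda r: (r["last_scheduling_decision"] is not None,
                       r["last_scheduling_decision"] or now)
    )
    for run in queued_runs[:MAX_DAGRUNS_PER_LOOP_TO_SCHEDULE]:
        if len(running_runs) < max_active_runs:
            # Promote to "running" while max_active_runs is not exhausted.
            run["state"] = "running"
            running_runs.append(run)
        else:
            # Otherwise only record that the run was examined, so a later
            # loop picks it up again once capacity frees up.
            run["last_scheduling_decision"] = now

    # Drop the promoted runs from the queued list.
    queued_runs[:] = [r for r in queued_runs if r["state"] == "queued"]
```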