easontm commented on issue #13542:
URL: https://github.com/apache/airflow/issues/13542#issuecomment-911359911
Thanks @ephraimbuddy! That wasn't it, but I did find something that _may_ be
issue-worthy. I added some custom logging to
`scheduler_job._start_queued_dagruns()` and noticed that the contents of
`dag_runs` were the same 20 DAGruns on every loop (the default count from the
`max_dagruns_per_loop_to_schedule` config). The DAGruns in question always sort
first under the following ordering from `dagrun.next_dagruns_to_examine()`:
```
.order_by(
    nulls_first(cls.last_scheduling_decision, session=session),
    cls.execution_date,
)
```
This DAG is set to `max_active_runs=1`, so none of the 20 examined queued
DAGruns change state (another run is already active). The problem is that
`_start_queued_dagruns()` (AFAIK) doesn't update `last_scheduling_decision`, so
every time the query for the next DAGruns is run, the same ones come back. They
keep not being scheduled for as long as the currently active DAGrun for that DAG
is running, and the cycle continues as long as more than 20 DAGruns are queued.
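To illustrate what I mean, here is a toy simulation I put together (not Airflow
code -- `FakeDagRun` and the local `next_dagruns_to_examine` only mimic the real
query's `NULLS FIRST` ordering and `LIMIT 20`):
```
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class FakeDagRun:
    execution_date: datetime
    last_scheduling_decision: Optional[datetime] = None

# 50 queued runs for a DAG with max_active_runs=1
queued = [FakeDagRun(datetime(2021, 1, 1) + timedelta(hours=i)) for i in range(50)]

def next_dagruns_to_examine(runs, limit=20):
    # mimics: ORDER BY last_scheduling_decision NULLS FIRST, execution_date LIMIT 20
    return sorted(
        runs,
        key=lambda r: (
            r.last_scheduling_decision is not None,
            r.last_scheduling_decision or datetime.min,
            r.execution_date,
        ),
    )[:limit]

first_batch = next_dagruns_to_examine(queued)
# _start_queued_dagruns() skips all of them (another run is active) and never
# touches last_scheduling_decision, so the next loop gets the identical batch.
second_batch = next_dagruns_to_examine(queued)
assert first_batch == second_batch  # the same 20 runs, every loop
```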
I think the `last_scheduling_decision` column needs to be updated somewhere
here:
```
if dag.max_active_runs and active_runs >= dag.max_active_runs:
    self.log.debug(
        "DAG %s already has %d active runs, not moving any more runs to RUNNING state %s",
        dag.dag_id,
        active_runs,
        dag_run.execution_date,
    )
```
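Something roughly like this is what I have in mind (just a sketch, untested; it
assumes the `dag`, `active_runs`, and `dag_run` names already in scope in
`_start_queued_dagruns()`, and `timezone` is `airflow.utils.timezone`):
```
from airflow.utils import timezone

if dag.max_active_runs and active_runs >= dag.max_active_runs:
    self.log.debug(
        "DAG %s already has %d active runs, not moving any more runs to RUNNING state %s",
        dag.dag_id,
        active_runs,
        dag_run.execution_date,
    )
    # hypothetical change: stamping the run means NULLS FIRST no longer sorts
    # it to the front of next_dagruns_to_examine(), so other queued runs get a
    # turn; dag_run came from the session, so the update is flushed with the
    # rest of the scheduler loop
    dag_run.last_scheduling_decision = timezone.utcnow()
```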
I was able to work around the issue for now by simply increasing the number of
DAGruns handled per loop (and telling my users not to queue so many), but
perhaps it should still be addressed.
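For reference, the workaround is just the existing knob in `airflow.cfg` (100
is an arbitrary value I picked that comfortably exceeds my users' queue depth):
```
[scheduler]
# default is 20; raising it lets one scheduler loop see past the stuck batch
max_dagruns_per_loop_to_schedule = 100
```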