Hi There!
TLDR: While working on fix PR https://github.com/apache/airflow/pull/61769
we noticed that Airflow Core currently counts the "Deferred" state
inconsistently. I propose to consistently count "Deferred" into the
counts of "Running".
Details:
* For Pools it has been possible for some time (since PR
https://github.com/apache/airflow/pull/32709) to decide whether
tasks in deferred state are counted into pool allocation or not.
* Before that, deferred tasks were not counted at all, so tasks in
deferred state could overwhelm backends, which defeated the
purpose of pools.
* Recently we also saw that the other limits we usually have on Dags,
listed below, do not consistently include deferred tasks:
o max_active_tasks - `The number of task instances allowed to run
concurrently`
o max_active_tis_per_dag - `When set, a task will be able to limit
the concurrent runs across logical_dates.`
o max_active_tis_per_dagrun - `When set, a task will be able to
limit the concurrent task instances per Dag run.`
* This means that, at the moment, defining a task as async/deferred
lets it escape these limits.
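
To illustrate the effect, here is a minimal toy sketch. It is not the
Scheduler code: the state names are plain strings, and `free_slots` and
the two state sets are hypothetical names I made up for this mail. It
only assumes that a limit is checked by counting task instances whose
state is in a fixed set.

```python
from collections import Counter

# Hypothetical stand-ins for task instance states; not imported from Airflow.
RUNNING, QUEUED, DEFERRED = "running", "queued", "deferred"

# Assumed counting set without "deferred", as described in this mail.
STATES_WITHOUT_DEFERRED = frozenset({RUNNING, QUEUED})
# Proposed counting set: a deferred task occupies a slot like a running one.
STATES_WITH_DEFERRED = STATES_WITHOUT_DEFERRED | {DEFERRED}


def free_slots(ti_states, limit, counted_states):
    """Return how many more task instances may start under `limit`."""
    counts = Counter(ti_states)
    active = sum(counts[state] for state in counted_states)
    return max(limit - active, 0)


# Four deferred and one running task instance, with a limit of 4.
states = [DEFERRED] * 4 + [RUNNING]
print(free_slots(states, 4, STATES_WITHOUT_DEFERRED))  # 3
print(free_slots(states, 4, STATES_WITH_DEFERRED))     # 0
```

With the first set, three more task instances could start although five
are already in flight; with the second set the limit holds.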
Code references:
* Counting tasks in Scheduler on main:
https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/jobs/scheduler_job_runner.py#L190
* EXECUTION_STATES used for counting:
https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/ti_deps/dependencies_states.py#L21
o Here "Deferred" is missing!
Alternatives that I see:
* Fix it consistently in the Scheduler so that limits are always
applied with Deferred counted in.
* There might be a historic reason why Deferred is not counted - then
proper documentation would be needed - but I'd assume this is
unlikely.
* There are different opinions - then the behavior might need to be
configurable. (But personally I cannot see a reason for letting
deferred tasks escape the defined limits.)
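
If we went for the configurable alternative, it could look roughly like
the following sketch. This is purely illustrative: `ConcurrencyLimit`,
its fields and `can_start` are hypothetical names, the flag is only
modelled on the idea of the per-pool switch mentioned above, and states
are plain strings rather than Airflow objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConcurrencyLimit:
    """Hypothetical limit definition; `include_deferred` is an assumed flag."""

    max_active: int
    include_deferred: bool = True  # assumed default: deferred counts in

    def counted_states(self):
        # States that occupy a slot under this limit.
        states = {"running", "queued"}
        if self.include_deferred:
            states.add("deferred")
        return frozenset(states)

    def can_start(self, ti_states):
        # A new task instance may start only while the limit is not reached.
        counted = self.counted_states()
        active = sum(1 for state in ti_states if state in counted)
        return active < self.max_active


states = ["deferred", "deferred", "running"]
print(ConcurrencyLimit(max_active=3).can_start(states))  # False
print(ConcurrencyLimit(max_active=3, include_deferred=False).can_start(states))  # True
```

The default chosen here reflects my preference that deferred tasks
count towards the limit unless someone explicitly opts out.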
Jens