Hello,

A while back I started a discussion on the mailing list about changing the task selection query in order to improve the scheduler's throughput.

https://github.com/apache/airflow/pull/54103

Another topic came up during that discussion: task starvation caused by the current selection algorithm. There are two open PRs with different fixes for that issue.

https://github.com/apache/airflow/pull/54284
https://github.com/apache/airflow/pull/53492

Everyone has different needs, and a good number of users probably won't experience the starvation issue. Each approach has its own advantages and disadvantages, so it doesn't feel like there is a right or wrong approach here, or a single solution that fits everyone. There have also been papers on task selection algorithms, like this one:

https://ieeexplore.ieee.org/document/9799199

I would like to suggest refactoring the scheduler so that the task selection algorithm is pluggable. The current implementation would remain the default, and everyone would be able to configure the path to their own class. I think that would be the most beneficial option for the majority of users. In the future, anyone could open a PR with their implementation and, if enough people like it, it could be added to the repo.

This has already been done for the priority weights algorithm, so why not in this case as well?

https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/priority-weight.html#custom-weight-rule

If there is positive feedback on this idea, I would like to implement it. Please let me know what you think.

Thank you!

Regards,
Christos
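
P.S. To make the idea a bit more concrete, here is a rough sketch of what a pluggable selection interface could look like. Everything in it is hypothetical and only for illustration: the names `CandidateTask`, `TaskSelectionStrategy`, `PriorityOrderStrategy`, `RoundRobinByPoolStrategy` and `load_strategy` do not exist in Airflow, and the "default" strategy is a simplified stand-in, not the scheduler's actual query. The only point is that the scheduler would call one interface, and the class behind it would be loaded from a configured dotted path.

```python
from __future__ import annotations

import importlib
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateTask:
    """Hypothetical stand-in for a task instance that is ready to be queued."""

    task_id: str
    pool: str
    priority_weight: int


class TaskSelectionStrategy(ABC):
    """Hypothetical interface the scheduler would call to pick tasks."""

    @abstractmethod
    def select(self, candidates: list[CandidateTask], max_tis: int) -> list[CandidateTask]:
        """Return at most max_tis tasks, in the order they should be queued."""


class PriorityOrderStrategy(TaskSelectionStrategy):
    """Simplified stand-in for a default: highest priority weight first."""

    def select(self, candidates: list[CandidateTask], max_tis: int) -> list[CandidateTask]:
        return sorted(candidates, key=lambda t: -t.priority_weight)[:max_tis]


class RoundRobinByPoolStrategy(TaskSelectionStrategy):
    """Illustrative alternative: take turns across pools to limit starvation."""

    def select(self, candidates: list[CandidateTask], max_tis: int) -> list[CandidateTask]:
        # Group by pool, keeping the highest priority first inside each pool.
        by_pool: dict[str, list[CandidateTask]] = {}
        for task in sorted(candidates, key=lambda t: -t.priority_weight):
            by_pool.setdefault(task.pool, []).append(task)

        selected: list[CandidateTask] = []
        while len(selected) < max_tis and any(by_pool.values()):
            for queue in by_pool.values():
                if queue and len(selected) < max_tis:
                    selected.append(queue.pop(0))
        return selected


def load_strategy(dotted_path: str) -> TaskSelectionStrategy:
    """Import a strategy class from a 'package.module.ClassName' path."""
    module_name, _, class_name = dotted_path.rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    if not (isinstance(cls, type) and issubclass(cls, TaskSelectionStrategy)):
        raise TypeError(f"{dotted_path} is not a TaskSelectionStrategy")
    return cls()


candidates = [
    CandidateTask("a1", "pool_a", 10),
    CandidateTask("a2", "pool_a", 9),
    CandidateTask("b1", "pool_b", 1),
]
by_priority = [t.task_id for t in PriorityOrderStrategy().select(candidates, 2)]
round_robin = [t.task_id for t in RoundRobinByPoolStrategy().select(candidates, 2)]
```

With these three made-up candidates and a limit of two, the priority-only stand-in picks both `pool_a` tasks, while the round-robin variant picks one task from each pool. Which behaviour is preferable depends on the deployment, which is exactly why I think the choice should be configurable.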
