Hello. Asquator and I have already been through this issue, and we have what we think is a decent implementation of a pluggable task selection algorithm for Airflow. It is available here: https://github.com/Asquator/airflow/tree/feature/pessimistic-task-fetching-with-window-function
I agree that no perfect task selection solution will ever exist in Airflow for all use cases, which is why this is probably a necessity rather than a nice-to-have feature. With the way we implemented it, we can ship a few pre-implemented algorithms that solve different issues, since not all users will encounter all issues. By making them pluggable through a configuration option, we can also document when to use a specific task selection algorithm, just like Jarek Potiuk proposed. It will not be customizable, but rather injectable inside of the airflow-core package.

Of course there are risks that come along with it, like users abusing it and trying to create a new task selection algorithm for each edge case or use case they have, which can become hard to maintain and follow. However, I do not agree that it makes the code harder to maintain (in terms of the amount of code) or makes mistakes more likely. If implemented correctly, each task selector is independent, acts as a black box, has a simple API, and can be interchanged without any code changes. In my opinion, that makes existing algorithms easier to maintain and removes the need to change a single big and sloppy file (as it is now).

In fact, I am certain that making it pluggable will simplify the scheduler altogether, as different parts will be clearly separated into different files and directories. Allowing injectable algorithms gives more flexibility, and can even make adding the new priority weights algorithm quite simple, without any massive changes.

The main downside is that we have to choose the API very carefully. Once we add it, it will be exceptionally hard to change, because any change would have to be made in multiple places and would be considered a breaking change.
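To make the "black box with a simple API" point a bit more concrete, here is a rough Python sketch of the shape I have in mind. It is only an illustration and not the code from our branch: every name in it (CandidateTask, TaskSelector, BUILTIN_SELECTORS, get_selector, the selector names) is hypothetical, and it works on an in-memory list, while the real scheduler selects tasks with a database query.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateTask:
    """Hypothetical stand-in for a scheduled task instance."""

    task_id: str
    dag_id: str
    priority_weight: int


class TaskSelector(ABC):
    """Hypothetical selector API: the scheduler would only ever call select()."""

    @abstractmethod
    def select(self, candidates: list[CandidateTask], max_tis: int) -> list[CandidateTask]:
        """Return at most max_tis tasks to queue, in the order to queue them."""


class PrioritySelector(TaskSelector):
    """Highest priority_weight first, regardless of which DAG a task belongs to."""

    def select(self, candidates, max_tis):
        ordered = sorted(candidates, key=lambda t: -t.priority_weight)
        return ordered[:max_tis]


class RoundRobinByDagSelector(TaskSelector):
    """Takes one task per DAG in turn, so a single busy DAG cannot fill every slot."""

    def select(self, candidates, max_tis):
        # Group tasks per DAG, keeping each group ordered by priority.
        queues: dict[str, list[CandidateTask]] = {}
        for task in sorted(candidates, key=lambda t: -t.priority_weight):
            queues.setdefault(task.dag_id, []).append(task)

        picked: list[CandidateTask] = []
        while len(picked) < max_tis and any(queues.values()):
            for queue in queues.values():
                if queue and len(picked) < max_tis:
                    picked.append(queue.pop(0))
        return picked


# Selectors live inside the package and are chosen by name from configuration,
# so they are injectable but not user-defined. The names are made up.
BUILTIN_SELECTORS: dict[str, type[TaskSelector]] = {
    "priority": PrioritySelector,
    "round_robin_by_dag": RoundRobinByDagSelector,
}


def get_selector(name: str) -> TaskSelector:
    """Resolve a configured selector name to an instance."""
    try:
        return BUILTIN_SELECTORS[name]()
    except KeyError:
        raise ValueError(f"Unknown task selector: {name!r}") from None
```

With something like this, swapping the algorithm is a one-line configuration change, and the scheduler loop itself never needs to know which selector it is holding. It also shows why the API has to be chosen carefully: the select() signature is the part every selector would depend on.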
On Mon, 1 Sept 2025 at 18:36, Christos Bisias <christos...@gmail.com> wrote:

> Hello,
>
> A while back I started a discussion on the mailing list regarding making
> some changes to the task selection query in order to improve the
> scheduler's throughput.
>
> https://github.com/apache/airflow/pull/54103
>
> Another topic came up during that discussion related to task starvation due
> to the current selection algorithm. There are two open PRs with different
> fixes for that issue.
>
> https://github.com/apache/airflow/pull/54284
>
> https://github.com/apache/airflow/pull/53492
>
> Everyone has his own needs and it's probable that a good number of users
> won't experience the starvation issue.
>
> Each approach has its own advantages and disadvantages and for that reason
> it doesn't feel like there is a right or wrong approach here or a single
> solution for all.
>
> There have been papers on task selection algorithms like this one
>
> https://ieeexplore.ieee.org/document/9799199
>
> I would like to suggest refactoring the scheduler so that the task
> selection algorithm can be pluggable. The current implementation will be
> the default. Everyone will be able to configure the path to his own class.
> That will be the most beneficial to the majority of users.
>
> In the future, anyone could create a PR with his implementation and if
> enough people like it, it could be added to the repo.
>
> This has already been done for the priority weights algorithm, so why not
> in this case as well?
>
> https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/priority-weight.html#custom-weight-rule
>
> If there is positive feedback on this idea, I would like to implement it.
>
> Please let me know what you think. Thank you!
>
> Regards,
> Christos