It really depends on the scenario. I believe it is impossible to get a straight answer from anyone, simply because there isn't a "general" answer to that. You really have to test it and see what works for you.
You can find, for example, this blog post describing how it behaved in some scenarios: https://www.astronomer.io/blog/airflow-2-scheduler

And this doc about fine-tuning the performance: https://airflow.apache.org/docs/apache-airflow/stable/concepts/scheduler.html#fine-tuning-your-scheduler-performance

As you will see in the docs, there are so many variables (starting from the filesystem, database choice, database performance and optimisation, and latency, and ending at the structure of your DAGs, how many DAG files vs. DAGs you have, and most of all the way they are written and optimised) that the answer is pretty much always "it depends". You need to look at your own case: verify, experiment, and see what works best, identify the bottlenecks you have, look at the "fine-tuning" doc, and, knowing what your bottleneck is, tune your performance accordingly (but it will always be "your deployment's" characteristics, which can be different from others'). The process of fine-tuning is explained in the doc I linked.

J.

On Fri, Nov 19, 2021 at 5:27 PM Nicolas Paris <[email protected]> wrote:
>
> hi
>
> the HA RFC [1] had two objectives:
>
> 1. HA
> 2. performance scalability
>
> Can anyone confirm that adding multiple schedulers can improve
> performance in the case of people having a HUGE number of DAGs (> 500)?
>
> Our finding is that increasing the resources on a standalone scheduler
> performs better than scaling them horizontally (partly due to the
> database impact of locks).
>
> Thanks
>
> [1]:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103092651
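
P.S. For anyone who hasn't opened the fine-tuning doc yet: in practice the knobs it discusses live in the [scheduler] section of airflow.cfg. The option names below are real Airflow 2 settings, but the values are made-up placeholders for illustration, not recommendations; what works depends entirely on your deployment:

```ini
[scheduler]
# How many processes parse DAG files in parallel (mostly CPU-bound).
parsing_processes = 4

# Minimum seconds between re-parses of the same DAG file; raising it
# lowers CPU and DB load at the cost of slower pickup of DAG changes.
min_file_process_interval = 30

# Row-level locking is what allows multiple schedulers to run against
# the same database; it can be disabled for a single-scheduler setup.
use_row_level_locking = True

# Upper bounds on how many DagRuns each scheduler loop creates and
# examines; trade scheduling latency against per-loop DB load.
max_dagruns_to_create_per_loop = 10
max_dagruns_per_loop_to_schedule = 20
```

Measure first, change one setting at a time, and re-measure; the doc explains which metric each setting trades against.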
