It could potentially be done, but I believe it would require quite a bit of work on re-architecting the way on how the database is accessed by Airflow and is not compatible with Airflow Architectes - we would probably not be able to gain significant scaling improvements anyway.
For now either Scheduler or Workers can both read and write to the DB, there are no pieces of Airflow that we can clearly separate as "read-only". The pattern you describe Valeriy makes sense if you have some components of the system that can rely on read-only access and then they could indeed read from the replicas. But to my understanding how the database is used in Airflow, this is not really the pattern we can have easily. Airflow currently uses a lot of DB locks as the HA/scalability mechanisms and the logic of DB writes is distributed among internal scheduler components as well as workers - it means that a lot of queries/DB access are read-write rather than read-only. In case of leader-> follower replication all writes have to go to the leader so if you do not have clearly separated "read-only" components, it brings no benefits at all. Simply not all software architectures play well with such leader-> follower patterns. I would love to understand Valeriy - do you have any observations and statistics that led you to the conclusion that we could improve horizontal scaling by using read replicas? Is it evidence based or more of a "general" statement without prior looking at how Airflow uses the DB? Do you have an example of queries/patterns of airflow usage of the DB that could benefit from it? I'd love to hear your experiences with it? I think even if there are such cases in Airflow where we have vast amounts of reads coming from one place without accompanying writes, they might come from mis-use of Airflow rather than from the internal "Db access architecture". For example there is one usage pattern that potentially generates a lot of reads, but this is actually an anti-pattern of Airflow usage. If you are using VARIABLES in the DAG construction phase, that can lead to a lot of reads generated for the variables table while DAGs are continuously being parsed by Airflow Scheduler, but this is definitely an Anti-Pattern http://airflow.apache.org/docs/apache-airflow/stable/best-practices.html#variables One other place that "could" potentially benefit from using read-replicas is the webserver. Webserver is "mostly read" by its nature. However Airflow's webserver "traffic" to databse is super-small comparing to the traffic generated by scheduler and workers. There are usually just a handful of people using the Airflow webserver and the frequency/amount of queries generated by combined usage of all the users would be few orders of magnitude less than the rest (I believe) so making it read-replica-aware would bring very little improvements (if any) J. On Sun, Mar 14, 2021 at 11:58 AM Xiaodong Deng <[email protected]> wrote: > This is definitely a good question. But to my understanding/opinion, this > should be addressed at the database side, and Airflow does not need to know > anything behind it or make any change (it simply accepts a DB connection > URI as *sql_alchemy_conn* in the configuration). > > Some related infor/discussions I can find, for your information: > - Postgres: > https://www.postgresql.org/message-id/CANu8Fiy_TFdFAbCt-snjwFP%2BJJs%2B%2B08r1X5Aincf-rb1ikDW%2Bg%40mail.gmail.com > - MariaDB MaxScale (NOTE: MariaDB is a fork of MySQL; MaxScale is > MariaDB's proxy extension; For Airflow, MariaDB is NOT tested/recommended. > Here I just take this as an example for your question): > https://mariadb.com/kb/en/mariadb-maxscale-24-readwritesplit/#readwritesplit-routing-decisions > > > Let me know if I missed or misunderstood anything. > > > XD > > On Sun, Mar 14, 2021 at 10:41 AM Valeriy Solovyov <[email protected]> > wrote: > >> Hello, >> >> It would be nice if we will be able to offload the read load from DB >> Leader by horizontal scaling, which relies on adding more database >> servers to a cluster to fulfill requests and splitting traffic between them. >> >> For example, we have the database Mysql Leader. Leader StandBy and 5 >> Followers. >> And HAProxy exposes Readony access to databases on one port and ReadWrite >> access on another. >> >> >> -- +48 660 796 129
