Hi all,
I have investigated high availability for the scheduler, since it is
crucial for our deployment, and I would like to share the results.
There is an existing solution for it, see [1] and [2]. On closer
inspection, I found that it uses SSH to check whether the scheduler is
running on the other node. For our use case this is not an optimal
solution, since we do not want SSH traffic between the nodes. I then
found that an HA cluster can be used to provide failover for the
scheduler. Consul [3] seems to be a very easy solution to use. I was
able to create such an HA cluster (using consul lock) very quickly. I
ran some tests on a cluster of 3 nodes, and it works great.
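To illustrate the consul lock approach mentioned above, here is a
minimal sketch of a wrapper that could be started on every node. It is
not the exact setup I tested: the KV prefix is a hypothetical name, and
it assumes that a local Consul agent is running and that both the
`consul` and `airflow` binaries are on the PATH. `consul lock` invokes
the child command only while the lock is held and terminates it if the
lock is lost, so only one node at a time runs the scheduler.

```python
import subprocess
import sys

# Hypothetical KV prefix used for the lock; any otherwise unused prefix
# in the Consul KV store should work.
LOCK_PREFIX = "service/airflow-scheduler/lock"


def build_lock_command(prefix=LOCK_PREFIX, child=("airflow", "scheduler")):
    """Return the argv that runs `child` under `consul lock`."""
    return ["consul", "lock", prefix, *child]


def main():
    # `consul lock` blocks until this node acquires the lock, then starts
    # the child process. The other nodes keep waiting and take over when
    # the lock is released or lost.
    return subprocess.call(build_lock_command())


if __name__ == "__main__":
    sys.exit(main())
```

Running the same script on all 3 nodes would leave two of them waiting
on the lock as standbys.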
I could not find any information on this topic in the Airflow
documentation. For someone (like me) with no experience with HA
clusters, it can be difficult to find out how such a cluster can be
deployed. In the future, I would like to write some documentation
about it. Do you think that would be a helpful contribution to the
project?
[1] https://github.com/teamclairvoyant/airflow-scheduler-failover-controller
[2] https://www.slideshare.net/RobertSanders49/airflow-clustering-and-high-availability
[3] https://www.consul.io/
Thanks,
Matus
I have done some research on this topic and I would like to share some
results with you.
On 02/09/2017 03:47 PM, matus valo wrote:
Hi all,
I am considering deploying Airflow as a pipeline framework. I have
found multiple articles explaining how to deploy Airflow in a
distributed environment (e.g. [1]). Unfortunately, I was not able to
find any use case where the scheduler is distributed across multiple
nodes. Is it possible to distribute the scheduler across multiple
nodes to avoid a single point of failure? I haven’t found any mention
of it in the documentation. According to [2] it is not possible, but
on the other hand [3] suggests that this could be solved in a new
version of Airflow.
Thanks,
Matus
[1] http://site.clairvoyantsoft.com/setting-apache-airflow-cluster/
[2] https://groups.google.com/forum/#!topic/airbnb_airflow/-1wKa3OcwME
[3] https://issues.apache.org/jira/browse/AIRFLOW-678