Hi all,

I have investigated high availability of the scheduler, since it is crucial for our deployment, and I would like to share the results.

I found that a solution already exists: see [1] and [2]. On closer inspection, it uses SSH to check whether the scheduler is running on the other node. For our use case this is not optimal, since we don't want SSH traffic between the nodes. I then found that an HA cluster can provide failover for the scheduler. Consul [3] seems to be a very easy-to-use solution: I was able to create such an HA cluster (using consul lock) very quickly. I ran some tests on a cluster of 3 nodes and it worked well.
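To illustrate the consul lock idea, here is a minimal sketch in Python. It is not the exact setup I tested. It assumes the `consul` and `airflow` binaries are on the PATH of every node, and the KV prefix `airflow/scheduler-lock` is a made-up name; any unused prefix should do. Each node runs the same wrapper: `consul lock` blocks until the node acquires the lock and only then starts the child command, so at most one scheduler should be running at a time.

```python
import shlex
import subprocess

# Hypothetical KV prefix used for the lock; not taken from any real deployment.
LOCK_PREFIX = "airflow/scheduler-lock"


def build_lock_command(prefix=LOCK_PREFIX, child=("airflow", "scheduler")):
    """Return the argv for `consul lock <prefix> <child...>`."""
    return ["consul", "lock", prefix, *child]


def run_scheduler_under_lock():
    """Wait for the lock, run the scheduler as the child, return its exit code."""
    # consul lock starts the child only while this node holds the lock.
    return subprocess.call(build_lock_command())


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(shlex.join(build_lock_command()))
```

On each node, a process supervisor (systemd, for example) would call `run_scheduler_under_lock()` and restart it on exit, so that a standby node takes over when the lock holder goes away.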

I could not find any information on this topic in the Airflow documentation. For someone (like me) who has no experience with HA clusters, it can be difficult to find out how to deploy one. I would like to write some documentation about it in the future. Do you think that would be a helpful contribution to the project?

[1] https://github.com/teamclairvoyant/airflow-scheduler-failover-controller
[2] https://www.slideshare.net/RobertSanders49/airflow-clustering-and-high-availability
[3] https://www.consul.io/

Thanks,


Matus

I have done some research on this topic and I would like to share some results with you.
On 02/09/2017 03:47 PM, matus valo wrote:

Hi all,

I am considering deploying Airflow as a pipeline framework. I have found multiple articles explaining how to deploy Airflow in a distributed environment (e.g. [1]). Unfortunately, I was not able to find any use case where the scheduler is distributed across multiple nodes. Is it possible to run the scheduler on multiple nodes to avoid a single point of failure? I haven't found any mention of it in the documentation. According to [2] it is not possible, but [3] suggests that this may be solved in a new version of Airflow.

Thanks,


Matus

[1] http://site.clairvoyantsoft.com/setting-apache-airflow-cluster/

[2] https://groups.google.com/forum/#!topic/airbnb_airflow/-1wKa3OcwME

[3] https://issues.apache.org/jira/browse/AIRFLOW-678

