Hi Stephen,

We are currently stress testing Airflow for use in a multi-master setup. One of my team members is doing a write-up that should show up online shortly.

TL;DR: in its current state Airflow will need some patches in order to run several schedulers concurrently. One issue is that Airflow can hit a database deadlock, which stops the scheduler from running. I have a patch for that here (https://github.com/apache/incubator-airflow/pull/2267) that works fine on Postgres/MySQL (the tests don't pass on SQLite yet due to limitations of SQLite).

Your global scheduler lock (e.g. via an active/passive configuration) might make the most sense for now.

Bolke

> On 22 May 2017, at 07:52, Stephen Rigney <[email protected]> wrote:
>
> Hi,
>
> We're running airflow in production, but for reliability (n.b. not
> performance) we'd like to confirm if it is safe to spawn multiple instances
> of the scheduler overlapping in time (otherwise we may need to put more
> effort into assuring two copies aren't ever spawned at once in our
> environment).
>
> It seems this officially wasn't a supported configuration back in 2015 (
> https://groups.google.com/d/msg/airbnb_airflow/-1wKa3OcwME/uATa8y3YDAAJ ),
> but has sufficient intra-airflow locking been added that it is now safe to
> start up two temporally overlapping instances of the scheduler for the same
> airflow system?
>
> Or should we hack in a "global scheduler lock" - we're not looking for
> increased performance by scheduler parallelism, just that if we ever fire
> up two instances of the scheduler nothing terrible happens?
>
> Stephen
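
For what it's worth, a "global scheduler lock" of the kind discussed above could be sketched roughly as below. This is only a minimal illustration, not something Airflow ships: the lock path and the helper names (`LOCK_PATH`, `try_acquire_lock`, `main`) are made up for the example, and it assumes both scheduler copies would start on the same Unix host, since `fcntl.flock` only coordinates processes that share a filesystem. A setup spanning several hosts would need a shared lock instead (for example one held in the metadata database).

```python
import fcntl
import subprocess
import sys

# Hypothetical lock file location; any path both wrapper processes can open works.
LOCK_PATH = "/tmp/airflow-scheduler.lock"


def try_acquire_lock(path):
    """Return an open file holding an exclusive lock, or None if it is already held."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting for the holder.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def main():
    lock = try_acquire_lock(LOCK_PATH)
    if lock is None:
        print("another scheduler holds the lock; not starting", file=sys.stderr)
        return 1
    try:
        # The lock is held for as long as this wrapper process is alive.
        return subprocess.call(["airflow", "scheduler"])
    finally:
        lock.close()  # closing the file releases the lock
```

A wrapper script would end with `sys.exit(main())`. Note that the lock belongs to the wrapper, not to the scheduler child: if the wrapper is killed while the child keeps running, the lock is released and a second copy could start, so the wrapper and the scheduler should be supervised as one unit.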
