thinkharderdev commented on issue #803: URL: https://github.com/apache/arrow-ballista/issues/803#issuecomment-1585613312
> I just want to be able to deploy multiple schedulers to ensure high availability of the scheduler. @thinkharderdev

So you have two options:

1. Out-of-the-box support for an HA scheduler. As @avantgardnerio mentioned, as long as you configure a storage backend that can be shared between the schedulers, this should work out of the box. There is already an etcd storage backend that you can use as-is, and implementing a custom backend is relatively straightforward if you want to use some other DB or KV store. However, the shared storage and distributed locking can add a significant amount of overhead.
2. If you need high throughput on task scheduling, you can implement an API layer in front of the schedulers and have each scheduler use only in-memory state. The API layer would need to know which scheduler "owns" each query and route status requests to the correct scheduler.

Option 2 is what we have done in our deployment. We have multiple schedulers, each using an in-memory `JobState`, and an API layer in front which routes calls to the appropriate scheduler. We also use a shared `ClusterState` based on redis (not yet upstreamed, but it is relatively straightforward to implement). This gives all the schedulers a consistent view of the executor task slots and, with a little bit of redis server-side scripting, doesn't require any distributed locks.

One downside of this approach is that the job state is volatile, so if a scheduler dies then all jobs running on it are lost. If you are running relatively short-duration queries then this is not a huge issue (at least for us): the scheduler will try to complete any in-flight jobs before it shuts down, so you can set up your deployment such that the schedulers have a shutdown grace period sufficient to complete any outstanding work.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org