thinkharderdev commented on issue #803:
URL: https://github.com/apache/arrow-ballista/issues/803#issuecomment-1585613312

   > I just want to be able to deploy multiple schedulers to ensure high 
availability of the scheduler. @thinkharderdev
   
   So you have two options:
   
   1. Out-of-the-box support for an HA scheduler. As @avantgardnerio 
mentioned, as long as you configure a storage backend that can be shared 
between the schedulers, this should just work. There is already an etcd 
storage backend that you can use, and implementing a custom backend is 
relatively straightforward if you want to use some other DB or KV store. 
However, the shared storage and distributed locking can add a significant 
amount of overhead. 
   2. If you need high throughput on task scheduling, you can implement an 
API layer in front of the schedulers that routes each call to the correct 
scheduler, and have the schedulers use only in-memory state. The API layer 
would need to know which scheduler "owns" each query and route status requests 
for that query to its owner. 
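   
   As a rough illustration of the routing in option 2, here is a minimal 
sketch of an ownership table that such an API layer could keep. Everything in 
it is hypothetical: `SchedulerRouter`, its methods, the round-robin assignment 
and the example addresses are assumptions made for this sketch, not part of 
Ballista's API and not necessarily how any real deployment does it.

```rust
use std::collections::HashMap;

/// Hypothetical routing table for an API layer in front of several schedulers.
/// None of these names come from Ballista; they only illustrate the idea.
pub struct SchedulerRouter {
    /// Addresses of the schedulers behind the API layer.
    schedulers: Vec<String>,
    /// Maps a job id to the index of the scheduler that owns it.
    owners: HashMap<String, usize>,
    /// Index of the scheduler that receives the next new job.
    next: usize,
}

impl SchedulerRouter {
    pub fn new(schedulers: Vec<String>) -> Self {
        assert!(!schedulers.is_empty(), "at least one scheduler is required");
        Self { schedulers, owners: HashMap::new(), next: 0 }
    }

    /// Pick a scheduler for a new job (round-robin, an assumption of this
    /// sketch) and remember that it owns the job.
    pub fn assign(&mut self, job_id: &str) -> &str {
        let idx = self.next % self.schedulers.len();
        self.next = (idx + 1) % self.schedulers.len();
        self.owners.insert(job_id.to_string(), idx);
        &self.schedulers[idx]
    }

    /// Find the scheduler that owns a job, e.g. to forward a status request.
    pub fn owner(&self, job_id: &str) -> Option<&str> {
        self.owners.get(job_id).map(|&idx| self.schedulers[idx].as_str())
    }

    /// Drop the mapping once a job has finished or its scheduler is gone.
    pub fn forget(&mut self, job_id: &str) {
        self.owners.remove(job_id);
    }
}
```

   The API layer would call `assign` when a query is submitted and `owner` 
for every later status request. If the API layer itself runs as more than one 
instance, this table would have to be shared between instances or derived 
from the job id; the sketch leaves that out.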
   
   Option 2 is what we have done in our deployment. We have multiple 
schedulers, each using an in-memory `JobState`, and an API layer in front that 
routes calls to the appropriate scheduler. We also use a shared `ClusterState` 
based on Redis (not yet upstreamed, but relatively straightforward to 
implement). This gives all the schedulers a consistent view of the executor 
task slots and, with a little bit of Redis server-side scripting, doesn't 
require any distributed locks. 
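   
   To make the "no distributed locks" point concrete, here is an in-process 
model of the slot bookkeeping. It is not our Redis implementation: 
`SlotTable` and its methods are hypothetical names, and the `Mutex` only 
stands in for the atomicity that a Redis server-side script provides by 
running as a single uninterrupted step. The point is just that the check and 
the decrement happen together, so two schedulers can never claim the same 
slot.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// In-process stand-in for a shared view of free executor task slots.
/// Hypothetical: in the deployment described above this state lives in Redis.
pub struct SlotTable {
    /// Free slots per executor id. The mutex models script-level atomicity.
    free: Mutex<HashMap<String, u32>>,
}

impl SlotTable {
    pub fn new(initial: &[(&str, u32)]) -> Self {
        let free = initial.iter().map(|(id, n)| (id.to_string(), *n)).collect();
        Self { free: Mutex::new(free) }
    }

    /// Reserve up to `wanted` slots in one atomic step and return the
    /// (executor id, slot count) pairs that were actually taken.
    pub fn reserve(&self, wanted: u32) -> Vec<(String, u32)> {
        let mut free = self.free.lock().unwrap();
        let mut remaining = wanted;
        let mut taken = Vec::new();
        // Visit executors in sorted order so the result is deterministic.
        let mut ids: Vec<String> = free.keys().cloned().collect();
        ids.sort();
        for id in ids {
            if remaining == 0 {
                break;
            }
            let avail = free.get_mut(&id).unwrap();
            let n = (*avail).min(remaining);
            if n > 0 {
                *avail -= n;
                remaining -= n;
                taken.push((id, n));
            }
        }
        taken
    }

    /// Give slots back once the tasks running on them have finished.
    pub fn release(&self, executor_id: &str, n: u32) {
        let mut free = self.free.lock().unwrap();
        *free.entry(executor_id.to_string()).or_insert(0) += n;
    }

    /// Total number of free slots across all executors.
    pub fn available(&self) -> u32 {
        self.free.lock().unwrap().values().sum()
    }
}
```

   A scheduler would call `reserve` before launching tasks and `release` when 
they complete. A reservation that returns fewer slots than requested simply 
means the cluster is busy, and the remaining tasks wait for a later attempt.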
   
   One downside of this approach is that the job state is volatile, so if a 
scheduler dies, all jobs running on it are lost. If you are running 
relatively short-duration queries, this is not a huge issue (at least for 
us). The scheduler will try to complete any in-flight jobs before it shuts 
down, so you can set up your deployment to give the schedulers a shutdown 
grace period long enough to complete any outstanding work. 


