[GitHub] [arrow-datafusion] realno commented on pull request #1560: Introduce push-based task scheduling for Ballista

GitBox Wed, 19 Jan 2022 19:01:33 -0800


realno commented on pull request #1560:
URL: https://github.com/apache/arrow-datafusion/pull/1560#issuecomment-1017072068



   > > You mentioned your use case is mainly interactive queries. Out of 
curiosity, what does your current Spark deployment look like? What are the 
biggest drivers (or pain points) for you to switch?
   > 
   > Hi @realno, actually our team has customized Spark to serve 
interactive queries. The Spark cluster is long-running, with a Thrift server for 
receiving query requests. The main pain point is that it's not fast enough to serve 
queries within seconds. That's the main reason we hope to use a native 
distributed compute engine with zero-copy shuffling for our case.
   
   That makes a lot of sense, @yahoNanJing. We have exactly the same setup for one 
use case. How many executors do you guys run?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
