thinkharderdev opened a new issue, #30:
URL: https://github.com/apache/arrow-ballista/issues/30

   Now that Ballista is its own top-level project, I want to extend a previous discussion around what exactly Ballista is (see https://github.com/apache/arrow-datafusion/issues/1916).
   
   I believe the consensus from that discussion was that Ballista should be a standalone system, but in practice I think we have adopted somewhat of a “both/and” approach, in the sense that we have added several extension points to Ballista (while also providing default implementations that allow you to run Ballista out of the box). I think this is a reasonable direction, and I agree that it is important that Ballista be something you can use “as is”.
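
   To make the “extension point with a default implementation” idea concrete, here is a minimal Rust sketch of the pattern. The trait and type names (`StateBackend`, `InMemoryStateBackend`, `Scheduler`) are hypothetical illustrations of the shape of such an API, not actual Ballista types:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// The extension point: how the scheduler persists its state.
trait StateBackend: Send + Sync {
    fn put(&self, key: &str, value: Vec<u8>);
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

/// A default implementation shipped with the system so it runs out of the box.
#[derive(Default)]
struct InMemoryStateBackend {
    state: Mutex<HashMap<String, Vec<u8>>>,
}

impl StateBackend for InMemoryStateBackend {
    fn put(&self, key: &str, value: Vec<u8>) {
        self.state.lock().unwrap().insert(key.to_string(), value);
    }

    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.state.lock().unwrap().get(key).cloned()
    }
}

/// The scheduler is written against the trait, not a concrete type.
struct Scheduler {
    state: Arc<dyn StateBackend>,
}

impl Scheduler {
    /// Out-of-the-box construction uses the default implementation.
    fn new_default() -> Self {
        Self {
            state: Arc::new(InMemoryStateBackend::default()),
        }
    }

    /// Customized construction accepts any implementation of the trait.
    fn with_state_backend(state: Arc<dyn StateBackend>) -> Self {
        Self { state }
    }
}

fn main() {
    // Works "as is" with the default backend.
    let scheduler = Scheduler::new_default();
    scheduler.state.put("job/1", b"queued".to_vec());
    assert_eq!(scheduler.state.get("job/1"), Some(b"queued".to_vec()));

    // A deployment with different needs can plug in its own backend.
    let _custom = Scheduler::with_state_backend(Arc::new(InMemoryStateBackend::default()));
}
```

   The key property is that the scheduler only depends on the trait: the default keeps the out-of-the-box experience, while a deployment that needs something else can inject its own implementation without forking the project.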
   
   That raises the question: which use cases for Ballista would we like to optimize for? 
   
   To take a concrete example, I think the current architecture is optimized for batch processing using a straightforward map-reduce implementation. This has many advantages (a simplified sketch of the scheduling loop follows the list):
   * Scheduling is relatively straightforward.
   * Cluster utilization is efficient since the unit of schedulable work is a 
single task, so whenever a task slot frees up you can just schedule a pending 
task on it. 
   * It is resilient to task failures since all intermediate results are serialized to disk, so recovering from a spurious task failure is just a matter of rescheduling the task. 
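
   As a rough illustration (these are not the actual Ballista scheduler types), the advantages above boil down to a very simple scheduling loop over single tasks, plus trivial recovery because shuffle output is already on disk:

```rust
use std::collections::VecDeque;

/// A single unit of schedulable work: one partition of one stage.
#[derive(Debug)]
struct Task {
    stage: usize,
    partition: usize,
}

struct SchedulerState {
    pending: VecDeque<Task>,
}

impl SchedulerState {
    /// Whenever an executor reports a free task slot, just hand it the next
    /// pending task; nothing bigger than a single task ever needs to fit.
    fn on_slot_freed(&mut self) -> Option<Task> {
        self.pending.pop_front()
    }

    /// Because each task's shuffle output is persisted to disk, recovering
    /// from a spurious task failure is just re-queueing the same task.
    fn on_task_failed(&mut self, task: Task) {
        self.pending.push_back(task);
    }
}

fn main() {
    let mut scheduler = SchedulerState {
        pending: VecDeque::from([
            Task { stage: 1, partition: 0 },
            Task { stage: 1, partition: 1 },
        ]),
    };

    // A slot frees up, so a pending task is scheduled onto it immediately.
    let task = scheduler.on_slot_freed().expect("a pending task");
    println!("scheduled {:?}", task);

    // If that task fails, it simply goes back into the queue.
    scheduler.on_task_failed(task);
}
```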
   
   However, for non-batch-oriented work it has some serious drawbacks:
   * Results can only be returned once the entire query finishes. We can’t 
start streaming results as soon as they are available. 
   * Stages are scheduled sequentially, so each stage is bound by its slowest partition. 
   * Queries that return large result sets can be quite resource-intensive, as each partition must serialize its entire shuffle result to disk. 
   
   There has already been some excellent proof-of-concept work done on a different scheduling paradigm in https://github.com/apache/arrow-datafusion/pull/1842 (something my team hopes to help push forward in the near future), but this raises some questions of its own. Namely, when do we use streaming vs map-reduce execution? In the PoC it is a very simple heuristic, but in real use cases I’m not sure there is a one-size-fits-all solution. In some cases you may want to use only streaming execution and rely on auto-scaling or dynamic partitioning to make sure each query can be scheduled promptly. Or you may only care about resource utilization and want to disable streaming execution entirely. 
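
   One way to avoid baking a single heuristic into the scheduler would be to make the decision itself an extension point. The following is only a sketch under that assumption; `ExecutionModePolicy`, `QueryProfile`, and `SimpleThresholdPolicy` are made-up names, not anything from the PoC:

```rust
/// How a query (or an individual stage) should be executed.
#[derive(Debug, PartialEq)]
enum ExecutionMode {
    /// Classic map-reduce: materialize shuffle output to disk between stages.
    BatchShuffle,
    /// Pipelined execution that streams results as they become available.
    Streaming,
}

/// Coarse properties of a query that a policy might look at.
struct QueryProfile {
    estimated_partitions: usize,
    client_wants_streaming: bool,
}

/// The decision itself is an extension point, so a deployment can choose
/// "always stream", "never stream", or something adaptive.
trait ExecutionModePolicy {
    fn choose(&self, profile: &QueryProfile) -> ExecutionMode;
}

/// One possible default: stream small interactive queries, fall back to
/// batch shuffle for anything large.
struct SimpleThresholdPolicy {
    max_streaming_partitions: usize,
}

impl ExecutionModePolicy for SimpleThresholdPolicy {
    fn choose(&self, profile: &QueryProfile) -> ExecutionMode {
        if profile.client_wants_streaming
            && profile.estimated_partitions <= self.max_streaming_partitions
        {
            ExecutionMode::Streaming
        } else {
            ExecutionMode::BatchShuffle
        }
    }
}

fn main() {
    let policy = SimpleThresholdPolicy { max_streaming_partitions: 16 };
    let small_interactive = QueryProfile { estimated_partitions: 8, client_wants_streaming: true };
    let large_batch = QueryProfile { estimated_partitions: 512, client_wants_streaming: false };

    assert_eq!(policy.choose(&small_interactive), ExecutionMode::Streaming);
    assert_eq!(policy.choose(&large_batch), ExecutionMode::BatchShuffle);
}
```

   An “always stream” or “never stream” deployment then becomes just another policy implementation (or a configuration value that selects one), rather than a fork of the scheduler.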
   
   This is one particular example but you can imagine many others. 
   
   To bring this back around to some concrete options, I think there are a few different ways we can go (a rough sketch contrasting options 1 and 3 follows the list):
   1. Ballista is a distributed computing framework which can be customized for many different use cases. There are default implementations available which allow you to use Ballista as a standalone system, but the internal implementation is defined in terms of interfaces which allow for customized implementations as needed. 
   2. Ballista is a standalone system with limited customization capabilities and is highly optimized for its intended use case. You can plug in certain things (ObjectStore implementations, UDFs, UDAFs, etc.) but the core functionality (scheduler, state management, etc.) is what it is, and if it doesn’t work for your use case then this is not the right solution for you. 
   3. Ballista is a standalone system which is highly configurable but not highly extensible. It ships with, for instance, schedulers optimized for different use cases which can be enabled through runtime configuration (or build-time features), but you can’t just plug in your own custom scheduler implementation (without upstreaming it :)). 
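
   To make the contrast between options 1 and 3 concrete, here is a rough Rust sketch. The names (`TaskScheduler`, `BatchShuffleScheduler`, `StreamingScheduler`) are made up for illustration and are not proposed APIs:

```rust
use std::sync::Arc;

trait TaskScheduler: Send + Sync {
    fn name(&self) -> &str;
}

struct BatchShuffleScheduler;
impl TaskScheduler for BatchShuffleScheduler {
    fn name(&self) -> &str {
        "batch-shuffle"
    }
}

struct StreamingScheduler;
impl TaskScheduler for StreamingScheduler {
    fn name(&self) -> &str {
        "streaming"
    }
}

/// Option 1: the scheduler is an interface. Users can inject their own
/// implementation alongside the ones that ship with the project.
fn build_scheduler_option1(custom: Option<Arc<dyn TaskScheduler>>) -> Arc<dyn TaskScheduler> {
    custom.unwrap_or_else(|| Arc::new(BatchShuffleScheduler))
}

/// Option 3: the set of schedulers is closed. Runtime configuration (or a
/// build-time feature) picks one; anything else has to be upstreamed.
fn build_scheduler_option3(config_value: &str) -> Arc<dyn TaskScheduler> {
    match config_value {
        "streaming" => Arc::new(StreamingScheduler),
        _ => Arc::new(BatchShuffleScheduler),
    }
}

fn main() {
    // Option 1: bring your own implementation of the trait.
    struct MyScheduler;
    impl TaskScheduler for MyScheduler {
        fn name(&self) -> &str {
            "my-custom-scheduler"
        }
    }
    println!("{}", build_scheduler_option1(Some(Arc::new(MyScheduler))).name());

    // Option 3: pick from the schedulers that ship with the project.
    println!("{}", build_scheduler_option3("streaming").name());
}
```

   In option 1 the set of schedulers is open (anyone can implement the trait), while in option 3 it is closed and selection happens purely through configuration or build-time features.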

