[GitHub] [arrow-ballista] alamb commented on issue #30: [Discuss] Ballista Future Direction

GitBox Tue, 11 Oct 2022 06:04:27 -0700


alamb commented on issue #30:
URL: https://github.com/apache/arrow-ballista/issues/30#issuecomment-1274655630


   Regarding pipeline execution (at least a push-based, pipelined execution 
model), I wanted to point out that @tustvold investigated this approach in 
DataFusion (and figured out a way to reuse the current operators). See 
https://github.com/apache/arrow-datafusion/pull/2226, which added a 
scheduler under a feature flag.
   
   Our eventual goal is to support running a plan on hundreds of Parquet 
files without having to fetch them all up front (or concurrently). However, 
other things are currently blocking this goal, so additional work on the 
scheduler is on hold for now.
   
   You can find more detail in 
https://github.com/apache/arrow-datafusion/issues/2504
   
   


