Andy Grove created ARROW-12255:
----------------------------------
Summary: [Rust] [Ballista] Integrate scheduler with DataFusion
Key: ARROW-12255
URL: https://issues.apache.org/jira/browse/ARROW-12255
Project: Apache Arrow
Issue Type: New Feature
Components: Rust - Ballista, Rust - DataFusion
Reporter: Andy Grove
Assignee: Andy Grove
Fix For: 5.0.0
The Ballista scheduler breaks a query down into stages based on changes in
partitioning in the plan. Each stage is then broken down into tasks that can
be executed concurrently.
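
As a rough illustration of the stage split described above, here is a minimal
sketch. It assumes a simplified, linear plan; the types Partitioning, Operator
and Stage and the function split_into_stages are hypothetical and are not the
Ballista or DataFusion API. A new stage starts wherever the partitioning
changes, and each stage yields one task per partition.

```rust
/// Hypothetical stand-in for an operator's output partitioning.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Partitioning {
    /// Round-robin over the given number of partitions.
    RoundRobin(usize),
    /// Hash on a column into the given number of partitions.
    Hash(&'static str, usize),
}

/// Hypothetical operator in a linear (non-branching) plan.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Operator {
    name: &'static str,
    partitioning: Partitioning,
}

/// A run of consecutive operators that share the same partitioning.
#[derive(Debug, PartialEq)]
struct Stage {
    id: usize,
    partitioning: Partitioning,
    operators: Vec<Operator>,
}

impl Stage {
    /// One task per partition; tasks within a stage can run concurrently.
    fn task_count(&self) -> usize {
        match self.partitioning {
            Partitioning::RoundRobin(n) | Partitioning::Hash(_, n) => n,
        }
    }
}

/// Walk the plan in order and start a new stage at each partitioning change.
fn split_into_stages(plan: &[Operator]) -> Vec<Stage> {
    let mut stages: Vec<Stage> = Vec::new();
    for op in plan {
        let starts_new_stage = match stages.last() {
            Some(stage) => stage.partitioning != op.partitioning,
            None => true,
        };
        if starts_new_stage {
            let id = stages.len();
            stages.push(Stage {
                id,
                partitioning: op.partitioning,
                operators: vec![*op],
            });
        } else if let Some(stage) = stages.last_mut() {
            stage.operators.push(*op);
        }
    }
    stages
}
```

For a plan with a scan and a filter on 4 round-robin partitions followed by an
aggregate hashed into 2 partitions, this sketch produces two stages with 4 and
2 tasks respectively. A real plan is a tree rather than a list, so the actual
scheduler has more to do than this.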
Rather than trying to run all the partitions at once, Ballista executors
process n concurrent tasks at a time and then request new tasks from the
scheduler.
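
The pull model above could be sketched roughly as follows, using only the Rust
standard library. The names Task, Scheduler, next_task and run_executor are
hypothetical and chosen for illustration; they are not the Ballista API. Each
of the n worker threads keeps asking the scheduler for the next task until
none remain, so at most n tasks are in flight at any time.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical unit of work: one partition of one stage.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Task {
    stage_id: usize,
    partition: usize,
}

/// Hypothetical scheduler holding the queue of tasks not yet handed out.
struct Scheduler {
    pending: Mutex<VecDeque<Task>>,
}

impl Scheduler {
    fn new(tasks: Vec<Task>) -> Self {
        Scheduler {
            pending: Mutex::new(tasks.into()),
        }
    }

    /// Hand out the next pending task, or None when the queue is empty.
    fn next_task(&self) -> Option<Task> {
        self.pending.lock().unwrap().pop_front()
    }
}

/// Run an executor with `concurrent_tasks` slots and return the tasks it
/// completed. Each slot pulls a new task as soon as it finishes the last one.
fn run_executor(scheduler: Arc<Scheduler>, concurrent_tasks: usize) -> Vec<Task> {
    let completed = Arc::new(Mutex::new(Vec::new()));
    let workers: Vec<_> = (0..concurrent_tasks)
        .map(|_| {
            let scheduler = Arc::clone(&scheduler);
            let completed = Arc::clone(&completed);
            thread::spawn(move || {
                while let Some(task) = scheduler.next_task() {
                    // A real executor would execute the partition here.
                    completed.lock().unwrap().push(task);
                }
            })
        })
        .collect();
    for worker in workers {
        worker.join().unwrap();
    }
    let done = completed.lock().unwrap().clone();
    done
}
```

In this sketch the scheduler lives in the same process as the executor, which
is closer to the DataFusion case of scaling across cores. In Ballista the
request for a new task would go over the network instead.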
This approach would help DataFusion scale better. Ideally, the same scheduler
would be used to scale across cores in DataFusion and across nodes in
Ballista.