alamb commented on a change in pull request #775:
URL: https://github.com/apache/arrow-datafusion/pull/775#discussion_r677347657

##########
File path: ballista/README.md
##########
@@ -35,9 +35,30 @@ Ballista can be deployed as a standalone cluster and also supports [Kubernetes](
 case, the scheduler can be configured to use [etcd](https://etcd.io/) as a backing store to (eventually) provide redundancy in the case of a scheduler failing.
+# Getting Started
+
+Fully working examples are available. Refer to the [Ballista Examples README](../ballista-examples/README.md) for
+more information.
+
+## Distributed Scheduler Overview
+
+Ballista uses the DataFusion query execution framework to create a physical plan and then transforms it into a
+distributed physical plan by breaking the query down into stages whenever the partitioning scheme changes.
+
+Specifically, any `RepartitionExec` operatoris is replaced with an `UnresolvedShuffleExec` and the child operator

Review comment:
oh man -- good eyes. You are right -- how about this for a suggestion:

```suggestion
Specifically, any `RepartitionExec` operator is replaced with an `UnresolvedShuffleExec` and the child operator
```
