andygrove opened a new issue #62: URL: https://github.com/apache/arrow-datafusion/issues/62
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
Ballista provides its own execution context but uses the DataFusion DataFrame. Calling `collect` on the DataFrame runs the query in-memory rather than distributed, so Ballista users must instead extract the logical plan from the DataFrame and call `BallistaContext.collect`. This is not good UX.

**Describe the solution you'd like**
As a user, I would just like to call `DataFrame.collect()` and have it run either in-memory or distributed depending on how I created the context. I think the way to do this is by making it possible to customize `ExecutionContext` and override the behavior when a DataFrame is collected.

**Describe alternatives you've considered**
None

**Additional context**
None
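To make the proposed customization concrete, here is a rough, self-contained sketch of one way `DataFrame.collect()` could be routed through whatever the context was created with. It does not use the real DataFusion or Ballista APIs: `PlanExecutor`, `InMemoryExecutor`, `DistributedExecutor`, `with_executor`, and the simplified `LogicalPlan`, `RecordBatch`, `ExecutionContext`, and `DataFrame` types are all hypothetical stand-ins, and the sketch is synchronous for brevity.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a logical plan; here it just wraps the query text.
#[derive(Debug, Clone, PartialEq)]
struct LogicalPlan(String);

// Hypothetical stand-in for a batch of result rows.
type RecordBatch = Vec<String>;

/// Hypothetical extension point: decides how a logical plan gets executed.
trait PlanExecutor {
    fn collect(&self, plan: &LogicalPlan) -> Result<Vec<RecordBatch>, String>;
}

/// Default behavior: pretend to run the plan in the local process.
struct InMemoryExecutor;

impl PlanExecutor for InMemoryExecutor {
    fn collect(&self, plan: &LogicalPlan) -> Result<Vec<RecordBatch>, String> {
        Ok(vec![vec![format!("in-memory: {}", plan.0)]])
    }
}

/// Ballista-style behavior: pretend to submit the plan to a remote scheduler.
struct DistributedExecutor {
    scheduler_url: String, // assumed configuration, for illustration only
}

impl PlanExecutor for DistributedExecutor {
    fn collect(&self, plan: &LogicalPlan) -> Result<Vec<RecordBatch>, String> {
        Ok(vec![vec![format!(
            "distributed via {}: {}",
            self.scheduler_url, plan.0
        )]])
    }
}

/// Simplified context that owns the executor chosen at construction time.
struct ExecutionContext {
    executor: Arc<dyn PlanExecutor>,
}

impl ExecutionContext {
    /// Context with the default in-memory behavior.
    fn in_memory() -> Self {
        Self { executor: Arc::new(InMemoryExecutor) }
    }

    /// Context with a caller-supplied executor (the customization hook).
    fn with_executor(executor: Arc<dyn PlanExecutor>) -> Self {
        Self { executor }
    }

    /// Build a DataFrame that remembers which executor created it.
    fn sql(&self, query: &str) -> DataFrame {
        DataFrame {
            plan: LogicalPlan(query.to_string()),
            executor: Arc::clone(&self.executor),
        }
    }
}

/// Simplified DataFrame: a plan plus the executor inherited from its context.
struct DataFrame {
    plan: LogicalPlan,
    executor: Arc<dyn PlanExecutor>,
}

impl DataFrame {
    /// Delegates to the context's executor, so callers never pick a mode here.
    fn collect(&self) -> Result<Vec<RecordBatch>, String> {
        self.executor.collect(&self.plan)
    }
}
```

With this shape, user code calls `ctx.sql(...).collect()` in both cases, and only the way the context is constructed (`ExecutionContext::in_memory()` versus `ExecutionContext::with_executor(...)`) determines where the query runs.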
