andygrove opened a new pull request #8283: URL: https://github.com/apache/arrow/pull/8283
This PR introduces a scheduler for query execution that breaks a physical plan down into a DAG of query stages based on changes in partitioning. Each stage represents a portion of the query where partitions (tasks) can be executed in parallel on a thread pool. The intent is for the scheduler to decide how to allocate tasks to threads/cores and move all the threading logic out of the executors themselves. The code is based on a working prototype that I had previously implemented in Ballista (also ASL 2.0) and myself and @jorgecarleitao have been the only contributors to this code so far and we both have signed CLAs on file. The current code compiles but is not complete and doesn't actually work yet. I will try and get this fully working for the 2.0.0 release if others think this is a good approach. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
