Dandandan opened a new issue, #21719: URL: https://github.com/apache/datafusion/issues/21719
### Is your feature request related to a problem or challenge?

Currently we execute the physical plan using mostly a Volcano model. Scans are also currently (mostly) executed with local aggregation state, as each partition is spawned on the Tokio runtime (and by default we usually use as many partitions as threads).

We don't control thread locality everywhere: because we spawn a task for each partition, Tokio is free to move or steal those tasks, which might increase cross-thread movement.

### Describe the solution you'd like

Implement a morsel/pipeline-driven scheduler.

https://github.com/apache/datafusion/pull/2226 implemented this before, but it was removed after some time, probably because it did not gain enough traction at the time.

### Describe alternatives you've considered

Just keep and improve the current execution model, and make it work better within the multi-threaded Tokio runtime.

### Additional context

_No response_

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
