Dandandan opened a new issue, #21719:
URL: https://github.com/apache/datafusion/issues/21719

   ### Is your feature request related to a problem or challenge?
   
   Currently we execute the physical plan mostly using a Volcano-style model.
   
   Scans are also currently (mostly) executed with local aggregation state, as 
each partition is spawned on the Tokio runtime (and the number of partitions 
usually defaults to the number of threads).
   
   We don't control thread locality everywhere: because we spawn a task for 
each partition, Tokio is free to move or steal those tasks between worker 
threads, which might increase cross-thread data movement.
   
   ### Describe the solution you'd like
   
   Implement a morsel/pipeline-driven scheduler:
   
   https://github.com/apache/datafusion/pull/2226 implemented this before, but 
it was removed after some time, probably because it didn't gain enough traction 
at the time.
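
To illustrate the general idea (not the design of the PR above, and not any DataFusion or Tokio API), here is a minimal sketch of morsel-driven execution using only the Rust standard library. It assumes a fixed pool of worker threads that claim small ranges of the input ("morsels") from a shared cursor, keep their aggregation state thread-local, and merge it once at the end. The names `morsel_sum`, `morsel_size`, and `workers` are hypothetical and exist only for this example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Sums `data` by letting `workers` threads pull fixed-size morsels
/// from a shared cursor. Hypothetical helper, not a DataFusion API.
fn morsel_sum(data: &[i64], morsel_size: usize, workers: usize) -> i64 {
    assert!(morsel_size > 0 && workers > 0);
    // Shared cursor: start index of the next unclaimed morsel.
    let next = AtomicUsize::new(0);

    thread::scope(|scope| {
        let mut handles = Vec::with_capacity(workers);
        for _ in 0..workers {
            handles.push(scope.spawn(|| {
                // Thread-local aggregation state; not shared while running.
                let mut local = 0i64;
                loop {
                    // Claim the next morsel; each range is handed out once.
                    let start = next.fetch_add(morsel_size, Ordering::Relaxed);
                    if start >= data.len() {
                        break;
                    }
                    let end = (start + morsel_size).min(data.len());
                    local += data[start..end].iter().sum::<i64>();
                }
                local
            }));
        }

        // Merge the per-worker partial states once all workers finish.
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<i64>()
    })
}
```

For example, `morsel_sum(&values, 8, 4)` would process `values` in morsels of 8 elements across 4 workers. In this sketch a worker that finishes early simply claims the next morsel, so load balancing happens at morsel granularity while each unit of work stays on the thread that claimed it.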
   
   ### Describe alternatives you've considered
   
   Just keep/improve the current execution model and make it work better within 
the multi-threaded Tokio runtime.
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

