alamb commented on PR #7379:
URL: 
https://github.com/apache/arrow-datafusion/pull/7379#issuecomment-1690597737

   @wiedld and I spoke a bit this afternoon, and I think the next step for 
this PR is to find a query that shows a significant performance improvement. I 
think the one in 
https://github.com/apache/arrow-datafusion/pull/7379#issuecomment-1690507812 is 
a good candidate.
   
   I don't really understand the code in this PR yet, but the way I suggest 
trying to add more parallelism is by "buffering" the streams: rather than 
computing everything on demand with `poll_next`, spawn an explicit 
`tokio::task` for each input stream that tries to pull the next input while 
the current task is merging the input.
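
   To make the buffering idea concrete, here is a rough, dependency-free 
sketch. It is not the code in this PR: it swaps tokio tasks and streams for 
OS threads, iterators and `std::sync::mpsc::sync_channel`, and the names 
`buffered` and `merge_sorted` are hypothetical. In the real operator the 
producer would presumably be a spawned tokio task feeding a bounded async 
channel, but the shape is the same: each input gets its own worker that 
pre-pulls items while the merging side is busy.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical helper: drive `input` on a background thread so that up to
/// `capacity` items are already waiting when the consumer asks for them.
fn buffered<I>(input: I, capacity: usize) -> Receiver<I::Item>
where
    I: Iterator + Send + 'static,
    I::Item: Send + 'static,
{
    let (tx, rx) = sync_channel(capacity);
    thread::spawn(move || {
        for item in input {
            // `send` blocks while the buffer is full and fails once the
            // receiver has been dropped, which stops the producer.
            if tx.send(item).is_err() {
                break;
            }
        }
    });
    rx
}

/// Hypothetical k-way merge over already-sorted, buffered inputs.
fn merge_sorted(inputs: Vec<Receiver<i32>>) -> Vec<i32> {
    // Current head of each input; `None` once that input is exhausted.
    let mut heads: Vec<Option<i32>> =
        inputs.iter().map(|rx| rx.recv().ok()).collect();
    let mut out = Vec::new();
    loop {
        // Pick the smallest head (ties broken by input index).
        let next = heads
            .iter()
            .copied()
            .enumerate()
            .filter_map(|(i, head)| head.map(|v| (v, i)))
            .min();
        match next {
            Some((value, i)) => {
                out.push(value);
                // Refill from the same input; the producer thread has
                // usually fetched this item already.
                heads[i] = inputs[i].recv().ok();
            }
            None => break,
        }
    }
    out
}

fn main() {
    let a = buffered(vec![1, 4, 7].into_iter(), 2);
    let b = buffered(vec![2, 5, 8].into_iter(), 2);
    let c = buffered(vec![3, 6, 9].into_iter(), 2);
    assert_eq!(merge_sorted(vec![a, b, c]), vec![1, 2, 3, 4, 5, 6, 7, 8, 9]);
}
```

   The bounded channel matters here: it caps how far each producer can run 
ahead, so memory stays bounded even when one input is much faster than the 
merge.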
   
   
   Maybe @crepererum or @tustvold can help with a suggestion on how to do the 
"add buffering/new tasks" part in a reasonable Rust way.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
