alamb commented on issue #18782:
URL: https://github.com/apache/datafusion/issues/18782#issuecomment-3563395564

   > Coalesce batches in output stream when poll_next is called, e.g.
   
   This would be my personal suggestion as a place to start because:
   1. It should have the same effect as running `CoalesceBatches` in 
`RepartitionExec`. (If you coalesce on the input, it may change the execution 
order and affect performance in unexpected ways.)
   2. It sounds relatively simple
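   
   To make the idea concrete, here is a minimal, std-only sketch of coalescing inside an output stream's `poll_next`. It is not DataFusion code: `Batch` is a plain `Vec<i32>` standing in for a `RecordBatch`, and `BatchStream`, `Scripted`, and `CoalescingStream` are hypothetical names with a simplified `poll_next` that takes no `Context`. The sketch emits a batch once at least `target_rows` rows are buffered, which is only an assumption about the desired policy.

```rust
use std::collections::VecDeque;
use std::task::Poll;

/// Stand-in for a record batch: just a vector of row values (assumption).
type Batch = Vec<i32>;

/// Hypothetical, simplified stream interface (no `Context` or waker).
trait BatchStream {
    fn poll_next(&mut self) -> Poll<Option<Batch>>;
}

/// Test double that replays a scripted sequence of poll results.
struct Scripted(VecDeque<Poll<Option<Batch>>>);

impl BatchStream for Scripted {
    fn poll_next(&mut self) -> Poll<Option<Batch>> {
        // Once the script is exhausted, report end of stream.
        self.0.pop_front().unwrap_or(Poll::Ready(None))
    }
}

/// Wraps an input stream and emits batches with at least `target_rows`
/// rows, except possibly the final one.
struct CoalescingStream<S> {
    input: S,
    target_rows: usize,
    buffer: Batch,
    done: bool,
}

impl<S: BatchStream> CoalescingStream<S> {
    fn new(input: S, target_rows: usize) -> Self {
        Self { input, target_rows, buffer: Vec::new(), done: false }
    }
}

impl<S: BatchStream> BatchStream for CoalescingStream<S> {
    fn poll_next(&mut self) -> Poll<Option<Batch>> {
        if self.done {
            return Poll::Ready(None);
        }
        loop {
            match self.input.poll_next() {
                // Input not ready: keep the buffered rows for the next poll.
                Poll::Pending => return Poll::Pending,
                Poll::Ready(Some(batch)) => {
                    self.buffer.extend(batch);
                    if self.buffer.len() >= self.target_rows {
                        return Poll::Ready(Some(std::mem::take(&mut self.buffer)));
                    }
                }
                // Input exhausted: flush whatever is left, then finish.
                Poll::Ready(None) => {
                    self.done = true;
                    if self.buffer.is_empty() {
                        return Poll::Ready(None);
                    }
                    return Poll::Ready(Some(std::mem::take(&mut self.buffer)));
                }
            }
        }
    }
}
```

   Because the wrapper sits on the output side, the input is polled exactly as before and only the batches handed downstream change size. Buffered rows survive a `Pending` result and are flushed when the input ends.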
   
   > If coalescing in e.g. PerPartitionStream::poll_next_inner, it also means 
larger input batches for merge-sort.
   
   This is a good thing, I think.
   
   I don't have a strong opinion on whether a new stream would be better than 
integrating the coalescer into the existing streams. I think we would have to 
try both and see which looks best.
   
   Thank you for working on this @jizezhang 🙏


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

