alamb commented on issue #9370:
URL: 
https://github.com/apache/arrow-datafusion/issues/9370#issuecomment-1968743335

   >  One thing I can think of is that it turns any operator to a possibly 
delay introducing operator (where it wasn't otherwise), 
   
   This is a good point. If we introduced coalescing within the operator, it 
may result in buffering that is not obvious from the plan. For streaming use 
cases this could be a substantial problem, so perhaps we cannot remove 
`CoalesceBatchesExec`.
   
   I think one potential first step could be to create the `BatchCoalescer` as 
suggested, by refactoring (not yet removing) the code in 
`CoalesceBatchesStream`: 
https://github.com/apache/arrow-datafusion/blob/e62240969135e2236d100c8c0c01546a87950a80/datafusion/physical-plan/src/coalesce_batches.rs#L176-L280
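
   To make the idea concrete, here is a rough sketch of what such a helper 
might look like. It is not DataFusion's actual code: the struct layout, the 
`push` / `finish` method names, and the use of a plain `Vec<i32>` in place of 
an Arrow `RecordBatch` are all assumptions made to keep the example 
self-contained.

```rust
/// Stand-in for an Arrow `RecordBatch` (assumption): just a vector of rows.
type Batch = Vec<i32>;

/// Hypothetical helper that buffers small batches until at least
/// `target_batch_size` rows are available, then emits one larger batch.
struct BatchCoalescer {
    target_batch_size: usize,
    buffer: Vec<Batch>,
    buffered_rows: usize,
}

impl BatchCoalescer {
    fn new(target_batch_size: usize) -> Self {
        Self {
            target_batch_size,
            buffer: Vec::new(),
            buffered_rows: 0,
        }
    }

    /// Buffer `batch`. Returns a coalesced batch once enough rows have
    /// accumulated, otherwise `None` (the rows are held back).
    fn push(&mut self, batch: Batch) -> Option<Batch> {
        if batch.is_empty() {
            return None;
        }
        self.buffered_rows += batch.len();
        self.buffer.push(batch);
        if self.buffered_rows >= self.target_batch_size {
            self.finish()
        } else {
            None
        }
    }

    /// Flush whatever is buffered, e.g. when the input stream ends.
    fn finish(&mut self) -> Option<Batch> {
        if self.buffer.is_empty() {
            return None;
        }
        self.buffered_rows = 0;
        // Concatenate the buffered batches in arrival order.
        Some(self.buffer.drain(..).flatten().collect())
    }
}
```

   Every call to `push` that returns `None` is a point where rows are held 
back, which is exactly the hidden delay discussed above if an operator were to 
do this internally rather than through an explicit `CoalesceBatchesExec`.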
   
   

