alamb commented on issue #9792:
URL: 
https://github.com/apache/arrow-datafusion/issues/9792#issuecomment-2037489054

   I see -- thank you @berkaysynnada. Your explanation in
https://github.com/apache/arrow-datafusion/issues/9792#issuecomment-2037425563 
makes sense.
   
   Something still feels a little off about applying the limit in 
CoalesceBatches, as it seems it would always be better to apply the fetch 
below that ExecutionPlan.
   
   For example, in this plan it seems like it would be best to have the 
Aggregate stop after 5 rows:
   
   ```
   Limit: fetch=5
   --CoalesceBatches: target_size=1000
   ----Aggregate: to produce 5 rows, needs 500 rows <--- should stop after it has created 5 rows.
   ------StreamingTableExec
   ```
   
   It looks like there is already something similar here:
   
https://github.com/apache/arrow-datafusion/blob/63888e853b7b094f2f47f53192a94f38327f5f5a/datafusion/physical-plan/src/aggregates/row_hash.rs#L272-L276
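
To make the early-stop idea concrete, here is a minimal, self-contained sketch in plain Rust. It does not use DataFusion's actual types or APIs: `Batch` and `distinct_groups_with_limit` are hypothetical names invented for illustration, and the sketch assumes a grouping with no aggregate functions (DISTINCT-like), where stopping early cannot change any group's value. An aggregate with accumulators could only stop early under additional guarantees, such as ordered input.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a record batch: a single column of group keys.
type Batch = Vec<i64>;

/// Collects distinct group keys from a stream of batches and stops pulling
/// input as soon as `limit` groups exist.
///
/// Returns the groups in first-seen order and the number of batches consumed.
/// Assumption: there are no aggregate functions, so stopping early is safe.
fn distinct_groups_with_limit<I>(input: I, limit: usize) -> (Vec<i64>, usize)
where
    I: IntoIterator<Item = Batch>,
{
    let mut seen = HashSet::new();
    let mut groups = Vec::new();
    let mut batches_read = 0;

    // A limit of zero needs no input at all.
    if limit == 0 {
        return (groups, batches_read);
    }

    for batch in input {
        batches_read += 1;
        for key in batch {
            // `insert` returns true only the first time a key is seen.
            if seen.insert(key) {
                groups.push(key);
                if groups.len() == limit {
                    // Enough groups: stop without reading further batches.
                    return (groups, batches_read);
                }
            }
        }
    }

    // Input was exhausted before the limit was reached.
    (groups, batches_read)
}
```

With the limit enforced inside the aggregation, the operators above it (CoalesceBatches and Limit in the plan above) never have to wait for input that will be discarded.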
   


