alamb commented on issue #9792: URL: https://github.com/apache/arrow-datafusion/issues/9792#issuecomment-2037489054
I see -- thank you @berkaysynnada

https://github.com/apache/arrow-datafusion/issues/9792#issuecomment-2037425563 makes sense

Something still feels a little off with limiting in CoalesceBatches, as it seems it would always be better to do the fetch below that ExecutionPlan.

For example, in this plan it seems like it would be best to have the Aggregate stop after 5 rows:

```
Limit: fetch=5
--CoalesceBatches: target_size=1000
----Aggregate: to produce 5 rows, needs 500 rows <--- should stop after it has created 5 rows.
------StreamingTableExec
```

It looks like there is already something similar here: https://github.com/apache/arrow-datafusion/blob/63888e853b7b094f2f47f53192a94f38327f5f5a/datafusion/physical-plan/src/aggregates/row_hash.rs#L272-L276
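To make the "stop the Aggregate early" idea concrete, here is a toy sketch in plain Rust. It is not DataFusion code and does not use any DataFusion API: `CountingSource` and `distinct_groups_with_fetch` are hypothetical names made up for illustration. It also assumes a distinct-style grouping (GROUP BY with no aggregate functions), since that is a case where a group is final as soon as it is created; with something like `COUNT(*)`, stopping early would give wrong results.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an input operator (e.g. a streaming source).
/// It yields batches of group keys and counts the rows pulled from it,
/// so the effect of stopping early can be observed.
struct CountingSource {
    batches: std::vec::IntoIter<Vec<u32>>,
    rows_read: usize,
}

impl CountingSource {
    fn new(batches: Vec<Vec<u32>>) -> Self {
        Self {
            batches: batches.into_iter(),
            rows_read: 0,
        }
    }

    /// Pull the next batch, if any, and record how many rows it held.
    fn next_batch(&mut self) -> Option<Vec<u32>> {
        let batch = self.batches.next()?;
        self.rows_read += batch.len();
        Some(batch)
    }
}

/// Distinct-style grouping that stops pulling input once `fetch`
/// groups exist, instead of draining the source and letting a
/// parent operator discard the extra groups.
fn distinct_groups_with_fetch(source: &mut CountingSource, fetch: usize) -> Vec<u32> {
    let mut seen = HashSet::new();
    let mut groups = Vec::new();

    while groups.len() < fetch {
        // Input exhausted before reaching the fetch: return what we have.
        let Some(batch) = source.next_batch() else { break };
        for key in batch {
            // `insert` returns true only the first time a key is seen.
            if seen.insert(key) {
                groups.push(key);
                if groups.len() == fetch {
                    break; // enough groups, skip the rest of this batch
                }
            }
        }
    }
    groups
}
```

With ten batches of 100 rows each, where every batch introduces one new key, a fetch of 5 is satisfied after pulling five batches (500 rows); the remaining 500 rows are never requested from the source.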
