acking-you opened a new issue, #11980: URL: https://github.com/apache/datafusion/issues/11980
### Is your feature request related to a problem or challenge?

The current [CoalesceBatches](https://docs.rs/datafusion/latest/src/datafusion/physical_optimizer/coalesce_batches.rs.html#38) optimization rule only creates [CoalesceBatchesExec](https://docs.rs/datafusion/latest/datafusion/physical_plan/coalesce_batches/struct.CoalesceBatchesExec.html) based on the `batch_size` configured in the [config struct](https://docs.rs/datafusion-common/41.0.0/src/datafusion_common/config.rs.html#238), which can cause issues in some cases involving limit operators.

Consider the following scenario: when a plan that the rule applies to has a `limit` operator on top of `CoalesceBatchesExec`, and the `limit` value is less than the `batch_size`, the entire computation might be blocked until a full batch is collected, even though enough rows to satisfy the `limit` have already been produced.

A possible operator tree:

```text
SortExec: TopK(fetch=10), expr=[event_time@3 DESC]
  LocalLimitExec: fetch=100
    CoalesceBatchesExec: target_batch_size=8192
      FilterExec: event_time@3 = 10
        TableScanExec
```

Of course, we also need to consider special cases: if the limit operator is above `SortExec`, the limit shouldn't affect the `batch_size` value.

### Describe the solution you'd like

The `target_batch_size` should be determined based on the limit operator's value and the current parallelism.

### Describe alternatives you've considered

When operators downstream of the limit operator require a full table scan (e.g., `SortExec`), `batch_size` is not handled specially.

### Additional context

_No response_
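As a rough illustration of the proposed solution, the sketch below caps the coalescing target at the limit's `fetch` value unless a blocking operator such as `SortExec` sits between the limit and the coalescing step. The function `effective_target_batch_size` and its parameters are hypothetical names chosen for this example and are not part of DataFusion's API. The issue does not say how parallelism should enter the calculation, so this sketch leaves it out.

```rust
/// Hypothetical helper (not a DataFusion API): pick the batch size a
/// coalescing operator should target when a limit sits above it.
///
/// * `configured_batch_size`: the `batch_size` from the session config.
/// * `fetch`: the limit's row count, if a limit exists above the operator.
/// * `blocking_operator_between`: true when an operator that must consume
///   all of its input (for example a sort) sits between the limit and the
///   coalescing operator, so the limit cannot bound the rows needed.
pub fn effective_target_batch_size(
    configured_batch_size: usize,
    fetch: Option<usize>,
    blocking_operator_between: bool,
) -> usize {
    match fetch {
        // The limit bounds the rows needed: do not wait for more than
        // `fetch` rows. Clamp to at least 1 so the target is never zero.
        Some(limit) if !blocking_operator_between => {
            configured_batch_size.min(limit.max(1))
        }
        // No limit, or a blocking operator needs the full input anyway:
        // keep the configured batch size.
        _ => configured_batch_size,
    }
}
```

For the operator tree above, `effective_target_batch_size(8192, Some(100), false)` returns 100, while the same call with `blocking_operator_between` set to `true` keeps the configured 8192.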