Dandandan opened a new pull request, #21732: URL: https://github.com/apache/datafusion/pull/21732
## Summary

Scale `AggregateExec`'s per-batch emission size for `Partial` / `PartialReduce` modes to `batch_size * target_partitions` instead of `batch_size`.

## Motivation

A typical multi-phase aggregation plan looks like:

```
PartialAgg → RepartitionExec(Hash, P) → CoalesceBatchesExec → FinalAgg
```

`RepartitionExec` hash-partitions each input batch into P output sub-batches, so with PartialAgg emitting at `batch_size`, each sub-batch lands at ~`batch_size / P`. `CoalesceBatchesExec` is inserted specifically to re-accumulate these tiny sub-batches back to `batch_size` before FinalAgg consumes them.

This PR flips the approach: emit bigger upstream so the repartition naturally produces right-sized sub-batches on its own.

- RepartitionExec still does exactly one hash + mask + take pass per output partition per input batch — same total work, amortized over a larger batch (fewer outer-loop iterations, fewer mpsc sends).
- CoalesceBatchesExec effectively becomes a pass-through when its inputs are already sized correctly — no `concat_batches` copies on the hot path.
- FinalAgg sees batches at the conventional `batch_size` as expected.

Other aggregate modes (`Final`, `FinalPartitioned`, `Single`, `SinglePartitioned`) are unchanged since their output goes to the final consumer, which expects conventional batch sizing.

## Tradeoffs

- **Memory peak**: the in-flight batch between PartialAgg and RepartitionExec is `P×` larger. For typical ClickBench-scale outputs this is small; for extremely large group-by cardinalities it could matter.
- **Backpressure granularity**: the mpsc queue between PartialAgg and RepartitionExec carries fewer, larger items — slightly coarser backpressure, but the total data flow is unchanged.
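The sizing rule above can be sketched roughly as follows. This is a minimal illustration of the idea, not the actual DataFusion code: the `emission_size` helper and the enum layout here are hypothetical, chosen only to mirror the modes named in this PR.

```rust
// Hypothetical sketch of the per-batch emission sizing described above.
// Not the real DataFusion API; variant names mirror the modes in the PR text.
#[allow(dead_code)]
enum AggregateMode {
    Partial,
    PartialReduce,
    Final,
    FinalPartitioned,
    Single,
    SinglePartitioned,
}

/// Per-batch emission size for the aggregate's output.
/// Partial-phase output feeds RepartitionExec(Hash, P), which splits each
/// batch into P sub-batches; emitting batch_size * P lets each sub-batch
/// land at ~batch_size, making CoalesceBatchesExec a pass-through.
fn emission_size(mode: &AggregateMode, batch_size: usize, target_partitions: usize) -> usize {
    match mode {
        AggregateMode::Partial | AggregateMode::PartialReduce => {
            batch_size * target_partitions
        }
        // Final consumers expect conventional batch sizing.
        _ => batch_size,
    }
}

fn main() {
    // With batch_size = 8192 and P = 4 target partitions, the partial phase
    // emits 4x larger batches, so each hash sub-batch is ~8192 rows.
    assert_eq!(emission_size(&AggregateMode::Partial, 8192, 4), 32768);
    assert_eq!(emission_size(&AggregateMode::Final, 8192, 4), 8192);
}
```

The key design point, as the tradeoffs section notes, is that this trades a `P×` larger in-flight batch between PartialAgg and RepartitionExec for the elimination of `concat_batches` copies downstream.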
## Test plan

- [x] `cargo check -p datafusion-physical-plan`
- [x] `cargo clippy -p datafusion-physical-plan --all-targets -- -D warnings`
- [x] `cargo fmt --all -- --check`
- [x] `cargo test -p datafusion-physical-plan --lib aggregates::` (86 passed)
- [ ] ClickBench / TPC-H wall-clock numbers (happy to post when the benchmark runner picks this up)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
