Dandandan opened a new pull request, #21732:
URL: https://github.com/apache/datafusion/pull/21732

   ## Summary
   
   Scale `AggregateExec`'s per-batch emission size for `Partial` / 
`PartialReduce` modes to `batch_size * target_partitions` instead of 
`batch_size`.
   
   ## Motivation
   
   A typical multi-phase aggregation plan looks like:
   
   ```
   PartialAgg → RepartitionExec(Hash, P) → CoalesceBatchesExec → FinalAgg
   ```
   
   `RepartitionExec` hash-partitions each input batch into P output 
sub-batches, so with PartialAgg emitting at `batch_size`, each sub-batch lands 
at ~`batch_size / P`. `CoalesceBatchesExec` is inserted specifically to 
re-accumulate these tiny sub-batches back to `batch_size` before FinalAgg 
consumes them.
   
   This PR flips the approach: emit bigger upstream so the repartition 
naturally produces right-sized sub-batches on its own.
   
   - RepartitionExec still does exactly one hash + mask + take pass per output 
partition per input batch — same total work, amortized over a larger batch 
(fewer outer-loop iterations, fewer mpsc sends).
   - CoalesceBatchesExec effectively becomes a pass-through when its inputs are 
already sized correctly — no `concat_batches` copies on the hot path.
   - FinalAgg sees batches at the conventional `batch_size` as expected.
   
   Other aggregate modes (`Final`, `FinalPartitioned`, `Single`, 
`SinglePartitioned`) are unchanged since their output goes to the final 
consumer which expects conventional batch sizing.
   
   ## Tradeoffs
   
   - **Memory peak**: the in-flight batch between PartialAgg and 
RepartitionExec is `P×` larger. For typical ClickBench-scale outputs this is 
small; for extremely large group-by cardinalities it could matter.
   - **Backpressure granularity**: the mpsc queue between PartialAgg and 
RepartitionExec carries fewer, larger items — slightly coarser backpressure, 
but the total data flow is unchanged.
   
   ## Test plan
   
   - [x] `cargo check -p datafusion-physical-plan`
   - [x] `cargo clippy -p datafusion-physical-plan --all-targets -- -D warnings`
   - [x] `cargo fmt --all -- --check`
   - [x] `cargo test -p datafusion-physical-plan --lib aggregates::` (86 passed)
   - [ ] ClickBench / TPC-H wall-clock numbers (happy to post when the 
benchmark runner picks this up)
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to