hhhizzz opened a new pull request, #23182:
URL: https://github.com/apache/datafusion/pull/23182
## Which issue does this PR close?
Short-term solution for #23178.
## Rationale for this change
PR #23055 changed final hash aggregate output to emit groups incrementally with
`EmitTo::First(batch_size)`. For terminal final aggregate output, this can cause
the group value state to be repeatedly compacted while output batches are being
produced. On TPC-DS q23 this showed up as a significant regression.

This PR implements the short-term approach discussed in #23178: materialize the
final aggregate output once, then return slices of that materialized
`RecordBatch` according to `batch_size`.

This avoids changing the `GroupValues` API while preserving bounded downstream
batch sizes.
## What changes are included in this PR?
- Adds an `OutputtingMaterialized` hash aggregate state.
- Adds `MaterializedOutput`, a small wrapper around a `RecordBatch` plus output
  offset.
- Changes final hash aggregate output to:
  - emit all final groups once,
  - evaluate all final aggregate values once,
  - slice the materialized batch for subsequent output polling.
- Leaves partial aggregate output behavior unchanged.
- Adds focused tests for materialized output slicing and final hash aggregate
  output state transitions.
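
The "materialize once, then slice" idea can be sketched roughly as follows. This is not the code from this PR: the struct below only borrows the `MaterializedOutput` name, uses a plain `Vec<T>` as a stand-in for a `RecordBatch` so it compiles without external crates, and the `next_batch` method and `example` function are hypothetical names chosen for illustration.

```rust
/// Illustrative stand-in for a materialized-output wrapper.
/// Assumption: the real type wraps an Arrow `RecordBatch`; a `Vec<T>` of rows
/// is used here only to keep the sketch dependency-free.
struct MaterializedOutput<T> {
    /// Fully materialized final output, produced exactly once.
    rows: Vec<T>,
    /// Index of the next row that has not been handed out yet.
    offset: usize,
}

impl<T: Clone> MaterializedOutput<T> {
    fn new(rows: Vec<T>) -> Self {
        Self { rows, offset: 0 }
    }

    /// Hypothetical polling method: returns at most `batch_size` rows per
    /// call, or `None` once everything has been emitted.
    fn next_batch(&mut self, batch_size: usize) -> Option<Vec<T>> {
        // A zero batch size would never make progress, so treat it as "done".
        if batch_size == 0 || self.offset >= self.rows.len() {
            return None;
        }
        let end = (self.offset + batch_size).min(self.rows.len());
        // Copying here is a simplification; slicing an Arrow `RecordBatch`
        // with `RecordBatch::slice(offset, length)` is zero-copy.
        let batch = self.rows[self.offset..end].to_vec();
        // Only the offset advances; the materialized rows are never compacted.
        self.offset = end;
        Some(batch)
    }
}

/// Drains ten rows in batches of four and collects the emitted batches.
fn example() -> Vec<Vec<i32>> {
    let mut out = MaterializedOutput::new((0..10).collect::<Vec<i32>>());
    let mut batches = Vec::new();
    while let Some(batch) = out.next_batch(4) {
        batches.push(batch);
    }
    batches
}
```

The point of the sketch is that each poll only moves an offset forward, so the cost of producing the full output is paid once rather than on every emitted batch.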
## Performance
TPC-DS SF10 full 99 queries, 10 rounds:
- Total runtime ratio: `0.857051`
- Geomean ratio: `0.976652` (~2.4% faster)
- q23 ratio: `0.313770` (~218.7% faster), faster in `10/10` rounds
Regressions over 5% were observed in 10 queries. Most have small absolute
deltas. Ordered by absolute slowdown, they were:
- q67: `1.055907`, +170.996 ms
- q39: `1.060436`, +98.544 ms
- q9: `1.050135`, +37.858 ms
- q70: `1.061124`, +11.848 ms
- q35: `1.052392`, +9.386 ms
- q33: `1.063655`, +6.995 ms
- q98: `1.071688`, +6.515 ms
- q91: `1.109819`, +5.362 ms
- q15: `1.058356`, +5.072 ms
- q27: `1.057686`, +0.815 ms
Overall, this strongly recovers the q23 regression and improves the full-suite
geomean, but q67 and q39 are worth calling out as the largest residual per-query
slowdowns.
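
For readers converting between the ratios and the percentages quoted in this section, here is a small sketch of the arithmetic. It assumes each ratio is new runtime divided by baseline runtime, which is consistent with the figures above but is not stated explicitly. The function names are hypothetical.

```rust
/// Percent speedup implied by a runtime ratio (assumed: new / baseline).
/// A ratio below 1.0 means the new build is faster.
fn percent_faster(ratio: f64) -> f64 {
    (1.0 / ratio - 1.0) * 100.0
}

/// Percent slowdown implied by a runtime ratio above 1.0.
fn percent_slower(ratio: f64) -> f64 {
    (ratio - 1.0) * 100.0
}
```

Under that assumption, `percent_faster(0.313770)` is about 218.7 (q23) and `percent_faster(0.976652)` is about 2.4 (geomean), while `percent_slower(1.055907)` is about 5.6 (q67).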
## Testing
- `cargo fmt --all -- --check`
- `cargo test -p datafusion-physical-plan materializ`
- `cargo test -p datafusion-physical-plan aggregates::`
- TPC-DS SF10 q23, 3 rounds
- TPC-DS SF10 full 99 queries, 10 rounds