avantgardnerio opened a new pull request, #22741: URL: https://github.com/apache/datafusion/pull/22741
## Which issue does this PR close? Closes #22739. Follow-up in the family of #22721 / #22723 — same framework (#22626), different operator. ## Rationale for this change `GroupedHashAggregateStream`'s spill path goes through `GroupValuesRows::emit` (`datafusion/physical-plan/src/aggregates/group_values/row.rs:198`) → `RowConverter::convert_rows` (`arrow-row/src/lib.rs:806`) → `decode_column` (`lib.rs:1675`) → per-type decoders in `arrow-row/src/variable.rs` and `arrow-row/src/list.rs`. Every decoder calls `MutableBuffer::with_capacity` (or equivalent `Vec::with_capacity`) sized to the column's row-encoded byte total, with no call to `MemoryReservation::try_grow`. The bytes show up in process resident but never in `MemoryPool::reserved()`, leaking past the declared budget. At production scale this is a real OOM source — see #22739 for the incident write-up (high-cardinality `groupby` on a free-text log message field that took down 79 executor pods in 2 minutes). The same `GroupValuesRows::emit` → `convert_rows` chain is in that incident's stack trace. The leak isn't visible at upstream's current `HEADROOM_FACTOR = 8.0`. Tightening to 5.0 (same direction and rationale as #22721) is what makes the SLT fail with `allocator overdraft` instead of silently passing. ## What changes are included in this PR? Two changes: 1. **`HEADROOM_FACTOR: f64 = 8.0` → `5.0`** in `datafusion/sqllogictest/src/accounting_pool.rs`. Tighter framework slack so untracked allocations surface as test failures sooner. Same shape as #22721. 2. **New SLT `group_by_spill_row_decode.slt`** that exercises the row-encoded `GroupValues` path. Uses a wide Utf8 key paired with a small `List<Int>` "schema poisoner" so the schema falls outside `multi_group_by::supported_type` and routes through `GroupValuesRows` (single-column Utf8 alone would route to `GroupValuesBytes`). At pool=1M with `HEADROOM_FACTOR=5.0` the test fails with `allocator overdraft: account balance at panic = -1344326 bytes`, stack frames pointing at `arrow_row::variable::decode_binary_view_inner` ← `decode_string_view` ← `decode_column` ← `RowConverter::convert_rows` ← `GroupValuesRows::emit` ← `GroupedHashAggregateStream::emit` ← `spill`. ## Are these changes tested? CI is the test — `group_by_spill_row_decode.slt` is expected to **fail** under `--features memory-accounting` until the operator-side leak is fixed. That failure is the intended signal of this PR (sibling to #22721 in that respect), not a regression. The HEADROOM tightening also revives the previously-passing `nested_loop_join_spill.slt` failure mode that #22721 surfaces, so this PR effectively bundles a tighter SLT baseline. ## Are there any user-facing changes? No. Test-infrastructure-only. ## Proposed fixes for the bug surfaced See #22739 for the two options (point fix in `GroupValuesRows::emit` vs. systemic `OomGuardMemoryPool` approach, the latter mirroring @andygrove's [Comet prototype](https://github.com/apache/datafusion-comet/pull/4582)). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
