[PR] slt+accounting: surface GroupValuesRows::emit untracked decode (closes #22739) [datafusion]

via GitHub Wed, 03 Jun 2026 11:33:12 -0700


avantgardnerio opened a new pull request, #22741:
URL: https://github.com/apache/datafusion/pull/22741


   ## Which issue does this PR close?
   
   Closes #22739. Follow-up in the family of #22721 / #22723 — same framework 
(#22626), different operator.
   
   ## Rationale for this change
   
   `GroupedHashAggregateStream`'s spill path goes through 
`GroupValuesRows::emit` 
(`datafusion/physical-plan/src/aggregates/group_values/row.rs:198`) → 
`RowConverter::convert_rows` (`arrow-row/src/lib.rs:806`) → `decode_column` 
(`lib.rs:1675`) → per-type decoders in `arrow-row/src/variable.rs` and 
`arrow-row/src/list.rs`. Every decoder calls `MutableBuffer::with_capacity` (or 
equivalent `Vec::with_capacity`) sized to the column's row-encoded byte total, 
with no call to `MemoryReservation::try_grow`. The bytes show up in process 
resident but never in `MemoryPool::reserved()`, leaking past the declared 
budget.
   
   At production scale this is a real OOM source — see #22739 for the incident 
write-up (high-cardinality `groupby` on a free-text log message field that took 
down 79 executor pods in 2 minutes). The same `GroupValuesRows::emit` → 
`convert_rows` chain is in that incident's stack trace.
   
   The leak isn't visible at upstream's current `HEADROOM_FACTOR = 8.0`. 
Tightening to 5.0 (same direction and rationale as #22721) is what makes the 
SLT fail with `allocator overdraft` instead of silently passing.
   
   ## What changes are included in this PR?
   
   Two changes:
   
   1. **`HEADROOM_FACTOR: f64 = 8.0` → `5.0`** in 
`datafusion/sqllogictest/src/accounting_pool.rs`. Tighter framework slack so 
untracked allocations surface as test failures sooner. Same shape as #22721.
   
   2. **New SLT `group_by_spill_row_decode.slt`** that exercises the 
row-encoded `GroupValues` path. Uses a wide Utf8 key paired with a small 
`List<Int>` "schema poisoner" so the schema falls outside 
`multi_group_by::supported_type` and routes through `GroupValuesRows` 
(single-column Utf8 alone would route to `GroupValuesBytes`). At pool=1M with 
`HEADROOM_FACTOR=5.0` the test fails with `allocator overdraft: account balance 
at panic = -1344326 bytes`, stack frames pointing at 
`arrow_row::variable::decode_binary_view_inner` ← `decode_string_view` ← 
`decode_column` ← `RowConverter::convert_rows` ← `GroupValuesRows::emit` ← 
`GroupedHashAggregateStream::emit` ← `spill`.
   
   ## Are these changes tested?
   
   CI is the test — `group_by_spill_row_decode.slt` is expected to **fail** 
under `--features memory-accounting` until the operator-side leak is fixed. 
That failure is the intended signal of this PR (sibling to #22721 in that 
respect), not a regression.
   
   The HEADROOM tightening also revives the previously-passing 
`nested_loop_join_spill.slt` failure mode that #22721 surfaces, so this PR 
effectively bundles a tighter SLT baseline.
   
   ## Are there any user-facing changes?
   
   No. Test-infrastructure-only.
   
   ## Proposed fixes for the bug surfaced
   
   See #22739 for the two options (point fix in `GroupValuesRows::emit` vs. 
systemic `OomGuardMemoryPool` approach, the latter mirroring @andygrove's 
[Comet prototype](https://github.com/apache/datafusion-comet/pull/4582)).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[PR] slt+accounting: surface GroupValuesRows::emit untracked decode (closes #22739) [datafusion]

Reply via email to