zhuqi-lucas opened a new issue, #22701:
URL: https://github.com/apache/datafusion/issues/22701
### Is your feature request related to a problem or challenge?
`multi_group_by::supported_schema` is all-or-nothing: if **any single
column** in a `GROUP BY` is not in the `supported_type` allow-list, the entire
grouping falls back from the column-oriented `GroupValuesColumn` to the
byte-encoded `GroupValuesRows` path, **regardless of how many other columns
would have qualified for the fast path**.
This means a wide multi-column `GROUP BY` where most columns are perfectly
suited to `GroupValuesColumn` (Int / Utf8 / Date / Boolean) loses the entire
column-wise + short-circuit optimization because of a single outlier column
(e.g. a `LargeList<Struct<Utf8>>` column carried through the query for
downstream use). The fast-path-eligible columns get dragged onto the slow path.
The current trajectory in #22682 is to keep extending `supported_type`
type-by-type. That helps each new type but does not address the structural
lock-in: any future workload with a single uncommon column (Float16,
Decimal256, Duration, Interval, Union, an Arrow ExtensionType, ...) still hits
the same cliff. Every new type needs both a `supported_type` entry and a
specialized `GroupColumn` impl before any benefit is unlocked.
### Describe the solution you'd like
Introduce a generic `FallbackGroupColumn` that implements `GroupColumn` for
**any** `DataType`, backed by Arrow's existing generic builders. With this in
place, `make_group_column` (the per-field factory in `multi_group_by::mod`)
becomes:
```rust
fn make_group_column(field: &Field) -> Result<Box<dyn GroupColumn>> {
if let Some(fast) = try_specialized(field)? {
return Ok(fast); // existing primitive / byte / boolean / ...
builders
}
Ok(Box::new(FallbackGroupColumn::new(field.data_type().clone())))
}
```
`supported_type` (and the `supported_schema` gate in `new_group_values`)
flips meaning from **"which types are *allowed* in `GroupValuesColumn`"** to
**"which types have a fast specialization"**. The `GroupValuesRows` path is no
longer the fallback for unsupported types; it remains available only as an
explicit choice for workloads that benefit from row-encoded comparison.
Implementation sketch for `FallbackGroupColumn`:
- **Storage**: `arrow_select::interleave::MutableArrayData` (or equivalent
type-erased Arrow builder). `append_val(array, row)` becomes `mutable.extend(0,
row, row + 1)`, which works for every Arrow `DataType` including nested.
- **`equal_to`**: `arrow_ord::ord::make_comparator` produces a closure that
compares any two rows of two `ArrayRef`s of the same `DataType`. Slower
per-pair than a specialized impl, but **only invoked on candidate pairs that
survived the cross-column short-circuit**, which for a high-cardinality
multi-column GROUP BY is rare.
- **`vectorized_equal_to` / `vectorized_append`**: per-row loop in the first
version. Batched Arrow comparison kernels can be a later optimization.
- **`build`**: `make_array(self.storage.freeze())`.
- **`take_n`**: split the underlying Arrow array into a returned prefix and
a fresh `MutableArrayData` covering the remainder. Variable-length children
make this O(prefix + remainder) but it is correctness-bounded, not novel.
- **`size`**: sum of `MutableArrayData` allocated capacity over child
buffers.
### Expected impact
1. **Wide multi-column GROUP BYs**: no longer regress when one nested or
uncommon column is present. The cheap columns keep their column-wise +
short-circuit behavior; the outlier column pays a single recursive comparator
call only when the cheap columns all match.
2. **Future Arrow types**: any new Arrow type (extension types, Float16 once
supported, Decimal256) immediately benefits from `GroupValuesColumn` without
needing a contributor to add a new specialization. Specializations can still be
added later as opt-in optimizations on top.
3. **Maintainability**: the `supported_type` allow-list shrinks from a
correctness gate (which sometimes blocks legitimate queries from the fast path)
into a performance hint (which only blocks optimization).
4. **No regression on existing fast paths**: primitive / byte / byte-view /
boolean / decimal128 specializations still match first in `make_group_column`.
Only previously-rejected types fall through to the new builder.
### Risks and mitigations
| Risk | Mitigation |
|---|---|
| `make_comparator` performance for nested types is worse than
`arrow_row::Rows` memcmp | Mitigated by cross-column short-circuit. Benchmark
on representative shapes (single-column `List<Int>` GROUP BY where Row encoding
is currently competitive) and keep `GroupValuesRows` available for users who
explicitly want it. |
| `take_n` on variable-length child arrays involves copy on the remainder |
Already true for the specialized `List` / `Struct` builders proposed in #22682.
Same cost profile. |
| Memory accounting through `MutableArrayData` is less precise than
per-builder | Sum allocated child buffer capacities. Worst case slightly
over-reports, which is the safer direction for spill decisions. |
### Describe alternatives you've considered
- **Continue type-by-type additions (#22682)**: works but does not address
the lock-in. Each new type requires a specialized PR before any user with that
type sees benefit. Useful for hot specializations on top of the fallback
proposed here, not as a replacement.
- **Hybrid mode that only escalates to `GroupValuesRows` for the offending
column**: more code, more state to maintain, two parallel hash paradigms inside
a single intern. Generic fallback covers the same problem with much less
complexity.
### Additional context
Related to #22682, which proposes adding `List` / `LargeList` /
`FixedSizeList` / `Struct` / `Map` specializations to `supported_type`. The
generic fallback proposed here makes those specializations **optional
optimizations** rather than **prerequisites for the column-wise path**, and
addresses the broader category of "any single unsupported column drags
supported columns onto the slow path" without enumerating types one at a time.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]