zhuqi-lucas commented on PR #22706: URL: https://github.com/apache/datafusion/pull/22706#issuecomment-4602560085
Thanks for taking a look. Answers below. ## Q1: source of "most common offender" + supported-types table Yes, from production observation. The shape is a wide multi-column GROUP BY (~40 columns) on a SEC filings materialized view, where a single `LargeList<Struct<Utf8, LargeUtf8>>` column for filing notes was causing the entire grouping to fall back to `GroupValuesRows`. Without that column the grouping completes at ~15 GiB peak; with it the path switches to row encoding and peak grows past 25 GiB. The PR's memory regression tests synthesize that shape and show 2.2x to 3.9x savings on smaller versions of the same pattern. Below is the multi-column GROUP BY supported set after this PR. (Single-column GROUP BY has separate specializations in `single_group_by/` that this PR does not touch.) **Supported** - Int8..Int64, UInt8..UInt64, Float32, Float64 - Decimal128 - Utf8, LargeUtf8, Utf8View - Binary, LargeBinary, BinaryView - Boolean - Date32, Date64 - Time32(Second / Millisecond) - Time64(Microsecond / Nanosecond). This PR closes a pre-existing inconsistency on main where the dispatcher accepted these but `supported_type` did not. Copilot review caught it. - Timestamp(any unit) - **NEW: FixedSizeList\<primitive\>** (primitive child only) - **NEW: List\<T\> / LargeList\<T\>** (recursive: child must be a supported type) - **NEW: Struct\<...\>** (recursive: all fields must be supported types) **Still falls back to `GroupValuesRows`** - Float16, Decimal256 - FixedSizeList with non-primitive child - Map\<K, V\> (entry-ordering semantics deserve their own discussion) - Union, Dictionary, FixedSizeBinary - Interval, Duration The PR pins the `supported_type` ↔ `make_group_column` biconditional as a unit-test fuzz (`supported_type_and_make_group_column_stay_in_sync`) over 21 supported + 12 unsupported cases, so future drift between the allow-list and the dispatcher trips immediately. ## Q2: split into smaller PRs Happy to. Proposed stacked sequence: 1. **Refactor only.** Extract the inline match in `GroupValuesColumn::intern` into a recursive `make_group_column(field: &Field) -> Result<Box<dyn GroupColumn>>` factory and make `supported_type` recursive (the recursion is needed up front so subsequent PRs can plug in nested children cleanly). No new builders. Also includes the Time32 / Time64 alignment fix and the consistency fuzz so the invariant is locked from the start. 2. **FixedSizeList\<primitive\>** support. Smallest new builder, no recursive child handling needed. 3. **Struct\<...\>** support. Recursive child via PR 1's factory. 4. **List\<T\> and LargeList\<T\>** support. Recursive child via PR 1's factory. 5. **Memory regression tests + composition coverage.** `LargeList<Struct>`, `List<List>`, `Struct<Struct>` end-to-end, plus the GroupValuesColumn-vs-GroupValuesRows size assertions that pin the 2.2x to 3.9x savings. I'll prototype the split as stacked commits on this branch first so you can sanity-check the boundaries before I open the actual sequence of PRs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
