xiangfu0 opened a new pull request, #18662:
URL: https://github.com/apache/pinot/pull/18662

   ## Summary
   
   Adds native **`GROUP BY GROUPING SETS (...)` / `ROLLUP(...)` / `CUBE(...)`** 
and the **`GROUPING(...)` / `GROUPING_ID(...)`** functions (PostgreSQL 
semantics) to Apache Pinot's **single-stage (v1) query engine**.
   
   Previously these constructs were unsupported in the single-stage engine: the 
parser reduced `GROUP BY` to a flat expression list, and `PinotQuery`, 
`QueryContext`, the group-key generators, and the broker reduce path all 
assumed a single grouping.
   
   ```sql
   SELECT country, city, SUM(revenue), GROUPING(city)
   FROM sales
   GROUP BY ROLLUP(country, city)
   ```
   
   ## Design
   
   - A grouping-sets query is represented as the **union** of all grouping 
columns plus a list of **per-set bitmasks** (ROLLUP/CUBE expanded to grouping 
sets at parse time in `CalciteSqlParser`).
   - Each input row is expanded — in a **single scan** — into one group per 
grouping set, reusing the existing multi-value (`int[][]`) aggregation path 
(`GroupingSetsGroupKeyGenerator`).
   - A synthetic **`$grouping_id`** key column (the per-set bitmask) is 
appended after the union columns so rows from different sets never merge — e.g. 
a genuine `(a, NULL)` detail row stays distinct from a rolled-up `(a, NULL)` 
subtotal — and it powers `GROUPING()` / `GROUPING_ID()` at the broker.
   - Grouping-set key columns are serialized **null-aware regardless of the 
query's null-handling option**, since rolled-up columns are always NULL.
   - `GROUPING(args...)` is evaluated at the broker in post-aggregation by 
extracting the relevant bits from `$grouping_id` (works in SELECT, HAVING, and 
ORDER BY).
   - Multi-value grouping columns (Cartesian expansion) and filtered 
aggregations are supported. Per-segment group trim is bucketed **per grouping 
set**, so a global top-K cannot starve a low-magnitude set such as the grand 
total.
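
The row expansion and the `$grouping_id` discriminator described above can be sketched roughly as follows. This is an illustration only, not the code in `GroupingSetsGroupKeyGenerator`: the class and method names are hypothetical, and the bit convention (bit `i` of a mask is 1 when union column `i` is rolled up) is an assumption. The `grouping` helper follows PostgreSQL semantics, where the rightmost argument maps to the least-significant bit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names and bit layout are assumptions, not Pinot's actual API.
final class GroupingSetsRowExpansionSketch {

  /**
   * Expands one row's union-column values into one key per grouping set.
   * Rolled-up columns become null and the set's mask is appended as the
   * trailing $grouping_id value, so keys from different sets never collide.
   */
  static List<Object[]> expandRow(Object[] unionValues, int[] setMasks) {
    List<Object[]> keys = new ArrayList<>();
    for (int mask : setMasks) {
      Object[] key = new Object[unionValues.length + 1];
      for (int col = 0; col < unionValues.length; col++) {
        boolean rolledUp = ((mask >> col) & 1) == 1;
        key[col] = rolledUp ? null : unionValues[col];
      }
      key[unionValues.length] = mask; // synthetic $grouping_id column
      keys.add(key);
    }
    return keys;
  }

  /** GROUPING(args...): one bit per argument, leftmost argument most significant. */
  static int grouping(int groupingId, int... argColumns) {
    int result = 0;
    for (int col : argColumns) {
      result = (result << 1) | ((groupingId >> col) & 1);
    }
    return result;
  }

  public static void main(String[] args) {
    // ROLLUP(country, city) with country = column 0, city = column 1:
    // masks for (country, city), (country), and the grand total ().
    int[] rollupMasks = {0b00, 0b10, 0b11};
    // A row whose city is genuinely NULL.
    for (Object[] key : expandRow(new Object[] {"US", null}, rollupMasks)) {
      System.out.println(Arrays.toString(key) + " GROUPING(city)=" + grouping((Integer) key[2], 1));
    }
  }
}
```

In this sketch the detail key `[US, null, 0]` and the subtotal key `[US, null, 2]` differ only in the trailing mask, which is what keeps a genuine NULL apart from a rolled-up one.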
   
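
To illustrate why the trim is bucketed per grouping set, here is a minimal sketch of the idea. It is not the `TableResizer` implementation: the `Group` holder, the `trimPerSet` method, and the descending-by-value ordering are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of per-set bucketed trimming; not Pinot's TableResizer.
final class PerSetTrimSketch {

  /** One aggregated group: the set it belongs to, a display key, and its value. */
  static final class Group {
    final int groupingId;
    final String key;
    final long value;

    Group(int groupingId, String key, long value) {
      this.groupingId = groupingId;
      this.key = key;
      this.value = value;
    }
  }

  /** Keeps the top trimSize groups (by descending value) within each grouping set. */
  static List<Group> trimPerSet(List<Group> groups, int trimSize) {
    // Bucket groups by $grouping_id, preserving first-seen set order.
    Map<Integer, List<Group>> buckets = new LinkedHashMap<>();
    for (Group g : groups) {
      buckets.computeIfAbsent(g.groupingId, id -> new ArrayList<>()).add(g);
    }
    // Trim each bucket independently so no set can crowd out another.
    List<Group> kept = new ArrayList<>();
    for (List<Group> bucket : buckets.values()) {
      bucket.sort(Comparator.comparingLong((Group g) -> g.value).reversed());
      kept.addAll(bucket.subList(0, Math.min(trimSize, bucket.size())));
    }
    return kept;
  }
}
```

With groups `(0, a, 100)`, `(0, b, 90)`, `(0, c, 80)`, and `(3, total, 5)` and a trim size of 2, a global top-2 would keep only `a` and `b`, while the bucketed version also keeps the lone group from set 3.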
   ## Scope / limitations
   
   - **Single-stage engine only.** (The multi-stage engine parses grouping sets 
via Calcite but does not execute them correctly — pre-existing, out of scope 
here.)
   - Star-tree and `server.returnFinalResult` are bypassed for grouping-set 
queries; `DISTINCT` + grouping sets is rejected.
   
   ## Backward compatibility / rolling upgrade ⚠️
   
   Adds an optional `groupingSets` field to the `PinotQuery` Thrift wire 
object. A **new broker** sending a grouping-sets query to a **not-yet-upgraded 
server** fails with an actionable error (the reducer detects the missing 
`$grouping_id` column) instead of returning a silently-wrong result. **Upgrade 
servers before brokers.** A proper server-capability negotiation is left as a 
follow-up.
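
The reducer-side guard could look something like the sketch below. It only illustrates the check described above (fail fast when `$grouping_id` is missing from a server response); the class name, method name, and error message are hypothetical and do not reflect the actual reducer code.

```java
import java.util.Arrays;

// Hypothetical sketch of the rolling-upgrade guard; not the actual broker reducer.
final class GroupingIdGuardSketch {
  static final String GROUPING_ID_COLUMN = "$grouping_id";

  /**
   * Call only for grouping-sets queries. Throws if the server's response schema
   * has no $grouping_id column, which suggests the server ignored the new
   * optional groupingSets field and ran a plain GROUP BY instead.
   */
  static void requireGroupingIdColumn(String[] serverColumnNames) {
    if (!Arrays.asList(serverColumnNames).contains(GROUPING_ID_COLUMN)) {
      throw new IllegalStateException("Server response lacks " + GROUPING_ID_COLUMN
          + "; the server may predate grouping-sets support. Upgrade servers before brokers.");
    }
  }
}
```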
   
   ## Testing
   
   - **Unit:** parser expansion (ROLLUP → prefixes, CUBE → power set, GROUPING 
SETS, mixed, dedup, grand total) and column/set-count limit rejections; per-set 
bucketed trim (`TableResizerTest`).
   - **Integration (`GroupingSetsQueriesTest`, real cluster, ≥2 segments — 14 
tests):** the genuine-vs-rolled-up NULL discriminator, NULL round-trip with 
null handling on **and** off, INT/LONG/DOUBLE/STRING key types, GROUPING in 
SELECT/HAVING/ORDER BY, multi-value columns, filtered aggregations, ORDER BY on 
an aggregation, and a plain-`GROUP BY` regression.
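
For reference, the expansion shapes the parser unit tests check (ROLLUP to prefixes, CUBE to power set) can be sketched as below. This is a standalone illustration, not the `CalciteSqlParser` code; the class and method names are hypothetical, and the order in which sets are emitted is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of ROLLUP/CUBE expansion into explicit grouping sets.
final class RollupCubeExpansionSketch {

  /** ROLLUP(c1..cn): every prefix of the column list, from the full list down to (). */
  static List<List<String>> rollup(List<String> columns) {
    List<List<String>> sets = new ArrayList<>();
    for (int n = columns.size(); n >= 0; n--) {
      sets.add(new ArrayList<>(columns.subList(0, n)));
    }
    return sets;
  }

  /** CUBE(c1..cn): all 2^n subsets of the column list, full set first, () last. */
  static List<List<String>> cube(List<String> columns) {
    int n = columns.size();
    List<List<String>> sets = new ArrayList<>();
    for (int bits = (1 << n) - 1; bits >= 0; bits--) {
      List<String> set = new ArrayList<>();
      for (int i = 0; i < n; i++) {
        // The highest bit selects the first column, so subsets stay in column order.
        if (((bits >> (n - 1 - i)) & 1) == 1) {
          set.add(columns.get(i));
        }
      }
      sets.add(set);
    }
    return sets;
  }

  public static void main(String[] args) {
    System.out.println(rollup(List.of("country", "city")));
    System.out.println(cube(List.of("country", "city")));
  }
}
```

Under these assumptions `ROLLUP(country, city)` yields `(country, city)`, `(country)`, `()`, and `CUBE(country, city)` additionally yields `(city)`.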
   
   ## Follow-ups (not in this PR)
   
   - Server capability negotiation to harden the rolling-upgrade guard.
   - Dictionary-id fast path in the grouping-sets generator (performance).
   - Multi-stage engine support for grouping sets.
   
   ## Labels
   
   `feature`, `backward-incompat` (new optional Thrift field; servers must be 
upgraded before brokers), `release-notes`
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)
   



