xiangfu0 commented on PR #18662: URL: https://github.com/apache/pinot/pull/18662#issuecomment-4626960563
Addressed the review feedback in the latest revision (squashed): **Fixed** - **Empty-match bug (root cause):** `ResultsBlockUtils.buildEmptyGroupByQueryResults` now appends the synthetic `$grouping_id` column for grouping-set queries, so an empty match (e.g. fully-pruned segments) returns the same-width schema as a non-empty result. This removes the spurious "upgrade servers" error a `GROUP BY ROLLUP(...)` over an empty match could hit on a fully-upgraded cluster. Added a `testEmptyMatchRollup` integration test. - **Removed the redundant per-DataTable width guard:** with the empty path fixed, `BrokerReduceService.filterDataTablesAndPickSchema` already drops mismatched-width server responses (length-aware `Arrays.equals` on column types) and surfaces a `MERGE_RESPONSE` exception before reduce, so the extra `validateGroupingSetSchemaWidth` check was redundant; dropped it. - **Centralized the grouping-set conventions** into a single engine-agnostic owner, `org.apache.pinot.common.request.context.GroupingSets` (column name + the `GROUPING()/GROUPING_ID()` bit-math). The parser, reducer, post-aggregation handler, and table resizer now route through it, removing the duplicated bit-extraction in the two order-by/post-agg extractors. **Deliberate follow-ups (for the multi-stage-engine work, kept out of this single-stage PR to avoid destabilizing it):** - *Duplicate grouping sets* (e.g. `GROUPING SETS ((a),(a))`): the single-stage path de-dups today; matching Calcite's "preserve duplicates except under GROUP BY DISTINCT" semantics requires a per-set-instance discriminator (the bitmask alone can't distinguish identical sets), which is best landed alongside the MSE expander so both engines share one contract. - *`GROUPING` as a parser-level scalar over `$grouping_id`*: would let it flow through the existing scalar machinery uniformly (deleting the bespoke extractors) and share semantics with the planner-side MSE rewrite. The new `GroupingSets` helper is the foundation for that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
