xiangfu0 commented on PR #18662:
URL: https://github.com/apache/pinot/pull/18662#issuecomment-4626960563

   Addressed the review feedback in the latest revision (squashed):
   
   **Fixed**
   - **Empty-match bug (root cause):** 
`ResultsBlockUtils.buildEmptyGroupByQueryResults` now appends the synthetic 
`$grouping_id` column for grouping-set queries, so an empty match (e.g. 
fully-pruned segments) returns the same-width schema as a non-empty result. 
This removes the spurious "upgrade servers" error a `GROUP BY ROLLUP(...)` over 
an empty match could hit on a fully-upgraded cluster. Added a 
`testEmptyMatchRollup` integration test.
   - **Removed the redundant per-DataTable width guard:** with the empty path 
fixed, `BrokerReduceService.filterDataTablesAndPickSchema` already drops 
mismatched-width server responses (length-aware `Arrays.equals` on column 
types) and surfaces a `MERGE_RESPONSE` exception before reduce, so the extra 
`validateGroupingSetSchemaWidth` check was redundant; dropped it.
   - **Centralized the grouping-set conventions** into a single engine-agnostic 
owner, `org.apache.pinot.common.request.context.GroupingSets` (column name + 
the `GROUPING()/GROUPING_ID()` bit-math). The parser, reducer, post-aggregation 
handler, and table resizer now route through it, removing the duplicated 
bit-extraction in the two order-by/post-agg extractors.
   
   **Deliberate follow-ups (for the multi-stage-engine work, kept out of this 
single-stage PR to avoid destabilizing it):**
   - *Duplicate grouping sets* (e.g. `GROUPING SETS ((a),(a))`): the 
single-stage path de-dups today; matching Calcite's "preserve duplicates except 
under GROUP BY DISTINCT" semantics requires a per-set-instance discriminator 
(the bitmask alone can't distinguish identical sets), which is best landed 
alongside the MSE expander so both engines share one contract.
   - *`GROUPING` as a parser-level scalar over `$grouping_id`*: would let it 
flow through the existing scalar machinery uniformly (deleting the bespoke 
extractors) and share semantics with the planner-side MSE rewrite. The new 
`GroupingSets` helper is the foundation for that.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to