xiangfu0 commented on PR #18658: URL: https://github.com/apache/pinot/pull/18658#issuecomment-4676150351
Heads-up: found one more user-facing behavioral change in the 1.40 → 1.42 delta that isn't covered in the behavior notes above.

**[CALCITE-7052](https://issues.apache.org/jira/browse/CALCITE-7052) (1.41) changed `GROUP BY` identifier resolution when conformance allows group-by aliases** (BABEL does): `SqlValidatorImpl.ExtendedExpander` now resolves a simple identifier in `GROUP BY` to a **real FROM column first**, and falls back to the SELECT alias only when no column matches (or the match is ambiguous). These are MySQL semantics. Before this upgrade, the SELECT alias took precedence.

Net effect on the multi-stage engine: a query whose `GROUP BY` alias shadows a real column changes behavior. Concrete example that regressed in our e2e suite (the `github_events` table has a physical `id` column):

```sql
SELECT repo_name, valueIn(arrayToMV(label_ids), 199293022, 204137300, 3171280082) AS id, count(*)
FROM github_events
WHERE arrayToMV(label_ids) IN (199293022, 204137300, 3171280082)
  AND arraylength(label_ids) > 1
GROUP BY repo_name, id
ORDER BY count(*) ASC
LIMIT 5
```

Before the upgrade, this grouped by the `valueIn(...)` alias and returned results; after the upgrade, `GROUP BY id` binds the physical `id` column and validation fails with:

```
QueryValidationError: From line 1, column 37 to line 1, column 45: Expression 'label_ids' is not being grouped
```

Where the shadowed query happens to still validate (e.g. the alias expression only references the shadowing column), it will instead **silently group by the column**: a silent result change rather than an error.

Worth adding to this PR's behavior/compatibility notes and the next release notes: queries relying on a `GROUP BY` alias that collides with a real column must group by the full expression or rename the alias.

(The workaround is trivial, and arguably the new behavior is the saner, standard-aligned one. This is just a visibility call-out, not a request to change anything.)
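
For reference, the two workarounds applied to the example query might look like the sketch below. It has not been run against the upgraded build. The alias `matched_label_id` is a hypothetical name, chosen on the assumption that `github_events` has no column called that, so the alias fallback still applies.

```sql
-- Option 1: group by the full expression instead of the colliding alias.
SELECT repo_name, valueIn(arrayToMV(label_ids), 199293022, 204137300, 3171280082) AS id, count(*)
FROM github_events
WHERE arrayToMV(label_ids) IN (199293022, 204137300, 3171280082)
  AND arraylength(label_ids) > 1
GROUP BY repo_name, valueIn(arrayToMV(label_ids), 199293022, 204137300, 3171280082)
ORDER BY count(*) ASC
LIMIT 5;

-- Option 2: rename the alias so it no longer shadows the physical `id` column.
-- `matched_label_id` is a hypothetical alias, assumed not to match any column.
SELECT repo_name, valueIn(arrayToMV(label_ids), 199293022, 204137300, 3171280082) AS matched_label_id, count(*)
FROM github_events
WHERE arrayToMV(label_ids) IN (199293022, 204137300, 3171280082)
  AND arraylength(label_ids) > 1
GROUP BY repo_name, matched_label_id
ORDER BY count(*) ASC
LIMIT 5;
```

Option 1 keeps the output column name `id` for existing consumers. Option 2 changes the output column name, so callers that read the result by name would need the same rename.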
