xiangfu0 commented on PR #18658:
URL: https://github.com/apache/pinot/pull/18658#issuecomment-4676150351

   Heads-up: found one more user-facing behavioral change in the 1.40 → 1.42 
delta that isn't covered in the behavior notes above.
   
   **[CALCITE-7052](https://issues.apache.org/jira/browse/CALCITE-7052) (1.41) 
changed `GROUP BY` identifier resolution when conformance allows group-by 
aliases** (BABEL does): `SqlValidatorImpl.ExtendedExpander` now resolves a 
simple identifier in `GROUP BY` to a **real FROM column first**, and falls back 
to the SELECT alias only when no column matches (or the match is ambiguous) — 
MySQL semantics. Before this upgrade, the SELECT alias took precedence.
   
   Net effect on the multi-stage engine: a query whose `GROUP BY` alias shadows 
a real column changes behavior. Concrete example that regressed in our e2e 
suite (the `github_events` table has a physical `id` column):
   
   ```sql
   SELECT repo_name,
          valueIn(arrayToMV(label_ids), 199293022, 204137300, 3171280082) AS id,
          count(*)
   FROM github_events
   WHERE arrayToMV(label_ids) IN (199293022, 204137300, 3171280082)
     AND arraylength(label_ids) > 1
   GROUP BY repo_name, id
   ORDER BY count(*) ASC LIMIT 5
   ```
   
   Before the upgrade (Calcite 1.40) this grouped by the `valueIn(...)` alias 
and returned results; after the upgrade, `GROUP BY id` binds to the physical 
`id` column and validation fails with:
   
   ```
   QueryValidationError: From line 1, column 37 to line 1, column 45:
   Expression 'label_ids' is not being grouped
   ```
   
   Where the shadowed query still happens to validate (e.g. the alias 
expression references only the column of the same name), it will instead 
**silently group by the column**: a changed result rather than an error.
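   
   A hypothetical illustration of that silent case, assuming a table `my_table` 
with a numeric physical column `id` (both names are made up for this sketch, 
and I have not run it against a cluster):
   
   ```sql
   -- The alias `id` collides with the physical column `id`.
   SELECT id % 10 AS id,
          count(*)
   FROM my_table
   -- Old resolution: groups by the alias, i.e. by (id % 10).
   -- New resolution: groups by the physical column id, so there is one group
   -- per distinct id and (id % 10) is only computed on top of each group.
   GROUP BY id
   ```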
   
   Worth adding to this PR's behavior/compatibility notes and the next release 
notes: queries relying on a `GROUP BY` alias that collides with a real column 
must group by the full expression or rename the alias. (Workaround is trivial 
and arguably the new behavior is the saner, standard-aligned one — this is just 
a visibility call-out, not a request to change anything.)
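   
   For reference, a sketch of the two rewrites applied to the query above. The 
alias `label_id` is a name picked for illustration and assumes `github_events` 
has no physical column called `label_id`; neither form has been re-run here:
   
   ```sql
   -- Option 1: rename the alias so it no longer collides with the `id` column.
   SELECT repo_name,
          valueIn(arrayToMV(label_ids), 199293022, 204137300, 3171280082) AS label_id,
          count(*)
   FROM github_events
   WHERE arrayToMV(label_ids) IN (199293022, 204137300, 3171280082)
     AND arraylength(label_ids) > 1
   GROUP BY repo_name, label_id
   ORDER BY count(*) ASC LIMIT 5;

   -- Option 2: keep the `id` alias in the output, but group by the full expression.
   SELECT repo_name,
          valueIn(arrayToMV(label_ids), 199293022, 204137300, 3171280082) AS id,
          count(*)
   FROM github_events
   WHERE arrayToMV(label_ids) IN (199293022, 204137300, 3171280082)
     AND arraylength(label_ids) > 1
   GROUP BY repo_name,
            valueIn(arrayToMV(label_ids), 199293022, 204137300, 3171280082)
   ORDER BY count(*) ASC LIMIT 5;
   ```
   
   Option 1 keeps the `GROUP BY` short; option 2 preserves the output column 
name for clients that depend on it.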


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

