[ 
https://issues.apache.org/jira/browse/CALCITE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nirmal Govindaraj updated CALCITE-7453:
---------------------------------------
    Summary: Preserve duplicate star-expanded names for ambiguity checks  (was: 
Preserve duplicate output names through star expansion and report resulting 
ambiguity correctly)

> Preserve duplicate star-expanded names for ambiguity checks
> -----------------------------------------------------------
>
>                 Key: CALCITE-7453
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7453
>             Project: Calcite
>          Issue Type: Bug
>    Affects Versions: 1.42.0
>            Reporter: Nirmal Govindaraj
>            Assignee: Nirmal Govindaraj
>            Priority: Major
>              Labels: pull-request-available
>
> h2. Issue
> Two queries exposed the same underlying problem from opposite directions.
> {code:java}
> select * from (select deptno as num, empno as num from emp){code}
> should work, but failed with exception
> {code:java}
> At line 1, column 8: Column 'NUM' is 
> ambiguousorg.apache.calcite.runtime.CalciteContextException: At line 1, 
> column 8: Column 'NUM' is ambiguous      at 
> java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
>   at 
> java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
>   at 
> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)    
> at 
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:511)  
> at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:960) at 
> org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:945) at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:6151)
>       at 
> org.apache.calcite.sql.validate.DelegatingScope.fullyQualify(DelegatingScope.java:531)
>        at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.findTableColumnPair(SqlValidatorImpl.java:4476)
>      at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.isRolledUpColumn(SqlValidatorImpl.java:4523)
>         at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.expandStar(SqlValidatorImpl.java:715)
>        at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.expandSelectItem(SqlValidatorImpl.java:481)
>   {code}
> {code:java}
> select deptno from (select deptno, * from dept){code}
> should fail as ambiguous, but did not
> The first query failed because Calcite lost exact field identity after star 
> expansion when a subquery exposed multiple columns with the same visible 
> name. Later validation paths such as rolled-up-column checks re-resolved the 
> expanded identifier by name, and that name-based lookup could bind to the 
> wrong field. In practice this surfaced via isRolledUp, but the underlying 
> issue was broader: after star expansion, two distinct source fields had 
> collapsed to the same visible identifier and Calcite no longer knew which one 
> it was looking at.
> The second query did not fail because star expansion was uniquifying output 
> aliases. That meant Calcite was not preserving the real visible row type of 
> the subquery. Once the duplicate column name was rewritten to a synthetic 
> unique name, outer resolution no longer saw an ambiguity.
> These are two sides of the same problem: Calcite was not preserving duplicate 
> visible output names through star, and it also did not retain exact field 
> identity when duplicate names survived star expansion.
> h2. Solution
> This change fixes the problem at the root:
>  * Star expansion now preserves visible output names exactly, including 
> duplicates. For duplicate-name fields produced by star expansion, Calcite 
> records exact source-field metadata during validation.
>  * Later consumers use that exact metadata where name-based lookup is no 
> longer sufficient
>  * Ambiguous references are now reported where the preserved output row type 
> is genuinely ambiguous.
> h2. Notes
>  #  EXCLUDE / EXCEPT duplicate-name semantics are not being defined by this 
> PR and can be discussed separately in follow-up work.
>  # Existing pivot.iq coverage includes cases that intentionally generate 
> same-name aliases and currently fail, but after preserving duplicate output 
> names the expected behavior becomes less clear and raises a broader design 
> question about how pivot-generated aliases should behave. That discussion is 
> better handled in a follow-up PR.
> h2. Compatibility
> This will change visible output column names compared to previous Calcite 
> behavior



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to