[
https://issues.apache.org/jira/browse/CALCITE-7453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nirmal Govindaraj updated CALCITE-7453:
---------------------------------------
External issue URL: https://github.com/apache/calcite/pull/4847
> Preserve duplicate output names through star expansion and report resulting
> ambiguity correctly
> -----------------------------------------------------------------------------------------------
>
> Key: CALCITE-7453
> URL: https://issues.apache.org/jira/browse/CALCITE-7453
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 1.42.0
> Reporter: Nirmal Govindaraj
> Assignee: Nirmal Govindaraj
> Priority: Major
> Labels: pull-request-available
>
> h2. Issue
> Two queries exposed the same underlying problem from opposite directions.
> {code:java}
> select * from (select deptno as num, empno as num from emp){code}
> should work, but failed with exception
> {code:java}
> At line 1, column 8: Column 'NUM' is
> ambiguousorg.apache.calcite.runtime.CalciteContextException: At line 1,
> column 8: Column 'NUM' is ambiguous at
> java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
> at
> java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
> at
> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)
> at
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:511)
> at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:960) at
> org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:945) at
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:6151)
> at
> org.apache.calcite.sql.validate.DelegatingScope.fullyQualify(DelegatingScope.java:531)
> at
> org.apache.calcite.sql.validate.SqlValidatorImpl.findTableColumnPair(SqlValidatorImpl.java:4476)
> at
> org.apache.calcite.sql.validate.SqlValidatorImpl.isRolledUpColumn(SqlValidatorImpl.java:4523)
> at
> org.apache.calcite.sql.validate.SqlValidatorImpl.expandStar(SqlValidatorImpl.java:715)
> at
> org.apache.calcite.sql.validate.SqlValidatorImpl.expandSelectItem(SqlValidatorImpl.java:481)
> {code}
> {code:java}
> select deptno from (select deptno, * from dept) -- succeeds
> select deptno from (select deptno, deptno from dept) -- fails correctly{code}
> should fail as ambiguous, but did not
> The first query failed because Calcite lost exact field identity after star
> expansion when a subquery exposed multiple columns with the same visible
> name. Later validation paths such as rolled-up-column checks re-resolved the
> expanded identifier by name, and that name-based lookup could bind to the
> wrong field. In practice this surfaced via isRolledUp, but the underlying
> issue was broader: after star expansion, two distinct source fields had
> collapsed to the same visible identifier and Calcite no longer knew which one
> it was looking at.
> The second query did not fail because star expansion was uniquifying output
> aliases. That meant Calcite was not preserving the real visible row type of
> the subquery. Once the duplicate column name was rewritten to a synthetic
> unique name, outer resolution no longer saw an ambiguity.
> These are two sides of the same problem: Calcite was not preserving duplicate
> visible output names through star, and it also did not retain exact field
> identity when duplicate names survived star expansion.
> h2. Solution
> This change fixes the problem at the root:
> * Star expansion now preserves visible output names exactly, including
> duplicates. For duplicate-name fields produced by star expansion, Calcite
> records exact source-field metadata during validation.
> * Later consumers use that exact metadata where name-based lookup is no
> longer sufficient
> * Ambiguous references are now reported where the preserved output row type
> is genuinely ambiguous.
> h2. Notes
> 1. EXCLUDE / EXCEPT duplicate-name semantics are not being defined by this
> PR and can be discussed separately in follow-up work.
> 2. Existing pivot.iq coverage includes cases that intentionally generate
> same-name aliases and currently fail, but after preserving duplicate output
> names the expected behavior becomes less clear and raises a broader design
> question about how pivot-generated aliases should behave. That discussion is
> better handled in a follow-up PR.
> h2. Compatibility
> This will change visible output column names compared to previous Calcite
> behavior
--
This message was sent by Atlassian Jira
(v8.20.10#820010)