Nirmal Govindaraj created CALCITE-7453:
------------------------------------------
Summary: Preserve duplicate output names through star expansion
and report resulting ambiguity correctly
Key: CALCITE-7453
URL: https://issues.apache.org/jira/browse/CALCITE-7453
Project: Calcite
Issue Type: Bug
Affects Versions: 1.42.0
Reporter: Nirmal Govindaraj
Assignee: Nirmal Govindaraj
h2. Issue
Two queries exposed the same underlying problem from opposite directions.
{code:java}
select * from (select deptno as num, empno as num from emp){code}
should work, but failed with exception
{code:java}
At line 1, column 8: Column 'NUM' is
ambiguousorg.apache.calcite.runtime.CalciteContextException: At line 1, column
8: Column 'NUM' is ambiguous at
java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
at
java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)
at
org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:511) at
org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:960) at
org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:945) at
org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:6151)
at
org.apache.calcite.sql.validate.DelegatingScope.fullyQualify(DelegatingScope.java:531)
at
org.apache.calcite.sql.validate.SqlValidatorImpl.findTableColumnPair(SqlValidatorImpl.java:4476)
at
org.apache.calcite.sql.validate.SqlValidatorImpl.isRolledUpColumn(SqlValidatorImpl.java:4523)
at
org.apache.calcite.sql.validate.SqlValidatorImpl.expandStar(SqlValidatorImpl.java:715)
at
org.apache.calcite.sql.validate.SqlValidatorImpl.expandSelectItem(SqlValidatorImpl.java:481){code}
{code:java}
select deptno from (select deptno, * from dept) -- succeeds
select deptno from (select deptno, deptno from dept) -- fails correctly{code}
should fail as ambiguous, but did not
The first query failed because Calcite lost exact field identity after star
expansion when a subquery exposed multiple columns with the same visible name.
Later validation paths such as rolled-up-column checks re-resolved the expanded
identifier by name, and that name-based lookup could bind to the wrong field.
In practice this surfaced via isRolledUp, but the underlying issue was broader:
after star expansion, two distinct source fields had collapsed to the same
visible identifier and Calcite no longer knew which one it was looking at.
The second query did not fail because star expansion was uniquifying output
aliases. That meant Calcite was not preserving the real visible row type of the
subquery. Once the duplicate column name was rewritten to a synthetic unique
name, outer resolution no longer saw an ambiguity.
These are two sides of the same problem: Calcite was not preserving duplicate
visible output names through star, and it also did not retain exact field
identity when duplicate names survived star expansion.
h2. Solution
This change fixes the problem at the root:
* Star expansion now preserves visible output names exactly, including
duplicates. For duplicate-name fields produced by star expansion, Calcite
records exact source-field metadata during validation.
* Later consumers use that exact metadata where name-based lookup is no longer
sufficient
* Ambiguous references are now reported where the preserved output row type is
genuinely ambiguous.
h2. Notes
1. EXCLUDE / EXCEPT duplicate-name semantics are not being defined by this PR
and can be discussed separately in follow-up work.
2. Existing pivot.iq coverage includes cases that intentionally generate
same-name aliases and currently fail, but after preserving duplicate output
names the expected behavior becomes less clear and raises a broader design
question about how pivot-generated aliases should behave. That discussion is
better handled in a follow-up PR.
h2. Compatibility
This will change visible output column names compared to previous Calcite
behavior
--
This message was sent by Atlassian Jira
(v8.20.10#820010)