[
https://issues.apache.org/jira/browse/CALCITE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18053537#comment-18053537
]
lincoln lee edited comment on CALCITE-7390 at 1/22/26 12:19 PM:
----------------------------------------------------------------
I think I understand the issue you’re referring to now.
Given a table T(v BIGINT) and the query:
{code:java}
SELECT COUNT(*) OVER (ORDER BY v) v FROM T
{code}
since a window operator itself does not project columns, it always appends
newly computed results to the input schema. As a result, when preserving
original field names, we may encounter a situation similar to name conflicts in
joins. However, this does not directly affect downstream correctness, because
downstream operators access columns by index rather than by name.
For example, in:
{code:java}
SELECT * FROM T t1 JOIN T t2 ON t1.v = t2.v
{code}
the final SELECT is effectively handled as (v, v0).
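To make the naming point concrete, here is a minimal sketch in plain Java. It is not Calcite code: the class {{FieldNameSketch}} and the methods {{append}} and {{uniquify}} are hypothetical names chosen for illustration, and the numeric-suffix scheme is only assumed to mirror the (v, v0) naming mentioned above. It shows how appending a computed column without renaming yields a duplicate name, how a join-style rename would resolve it, and why positional access is unaffected either way.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Calcite.
class FieldNameSketch {
    /** Appends computed field names to the input names without renaming (window-style). */
    static List<String> append(List<String> input, List<String> computed) {
        List<String> out = new ArrayList<>(input);
        out.addAll(computed);
        return out;
    }

    /** Renames later duplicates with a numeric suffix, so (v, v) becomes (v, v0). */
    static List<String> uniquify(List<String> names) {
        Set<String> used = new HashSet<>();
        List<String> out = new ArrayList<>();
        for (String name : names) {
            String candidate = name;
            int suffix = 0;
            // Set.add returns false while the candidate is already taken.
            while (!used.add(candidate)) {
                candidate = name + suffix++;
            }
            out.add(candidate);
        }
        return out;
    }

    public static void main(String[] args) {
        // T(v) plus COUNT(*) OVER (ORDER BY v) aliased as v.
        List<String> windowFields = append(List.of("v"), List.of("v"));
        System.out.println(windowFields);           // [v, v]
        System.out.println(uniquify(windowFields)); // [v, v0]
        // Index 1 identifies the computed column regardless of its name.
        System.out.println(windowFields.get(1));    // v
    }
}
```

With the duplicate left in place, a lookup by name would be ambiguous, while a lookup by index (position 1 for the computed column) is not.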
Therefore, it might be more appropriate to use:
{code:java}
SELECT *, COUNT(*) OVER (ORDER BY v) v FROM T
{code}
as the example, because all columns are included in the final output and there
is no additional projection. In theory, the same situation applies to joins as
well (though note the change to ProjectRemoveRule in
https://issues.apache.org/jira/browse/CALCITE-6850, where matching {{anyInput}}
was changed to {{oneInput}}).
Therefore, if we decide to make the name change, should we unify this behavior
globally rather than limiting it to the window node?
What do you think?
was (Author: lincoln.86xy):
I think I understand the issue you’re referring to now.
Given a table T(v BIGINT) and the query:
{code}
SELECT COUNT(*) OVER (ORDER BY v) v FROM T
{code}
since a window operator itself does not project columns, it always appends
newly computed results to the input schema. As a result, when preserving
original field names, we may encounter a situation similar to name conflicts in
joins. However, this does not directly affect downstream correctness, because
downstream operators access columns by index rather than by name.
For example, in:
{code}
SELECT * FROM T t1 JOIN T t2 ON t1.v = t2.v
{code}
the final SELECT is effectively handled as (v, v0).
Therefore, it might be more appropriate to use:
{code}
SELECT *, COUNT(*) OVER (ORDER BY v) v FROM T
{code}
as the example. In this case, since all columns are included in the final
output and there is no additional projection, we need to resolve column name
conflicts inside the window node itself, following the projection naming rules.
What do you think?
> Types generated for Window can contain multiple fields with the same name
> -------------------------------------------------------------------------
>
> Key: CALCITE-7390
> URL: https://issues.apache.org/jira/browse/CALCITE-7390
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.42
> Reporter: Mihai Budiu
> Priority: Minor
>
> As a result of CALCITE-7375 some LogicalWindow operations can have types with
> duplicate names.
> [~lincoln.86xy] you have introduced this change.