[jira] [Comment Edited] (CALCITE-7390) Types generated for Window can contain multiple fields with the same name

lincoln lee (Jira) Thu, 22 Jan 2026 04:20:09 -0800


    [ 
https://issues.apache.org/jira/browse/CALCITE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18053537#comment-18053537
 ]


lincoln lee edited comment on CALCITE-7390 at 1/22/26 12:19 PM:
----------------------------------------------------------------

I think I understand the issue you’re referring to now.
Given a table T(v BIGINT) and the query:
{code:sql}
SELECT COUNT(*) OVER (ORDER BY v) v FROM T
{code}
since a window operator does not itself project columns, it always appends its 
newly computed results to the input schema. As a result, when the original field 
names are preserved, the output type can contain duplicate names, much like the 
name conflicts that arise in joins. However, this does not directly affect 
downstream correctness, because downstream operators access columns by index 
rather than by name.
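
To make the append behavior concrete, here is a minimal sketch in plain Java. It is not Calcite code: the class {{WindowRowTypeSketch}} and the method {{appendWindowFields}} are hypothetical names used only for illustration, and a row type is modeled as a simple list of field names. It assumes the window node concatenates the input fields and the computed fields without renaming anything.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; this is not how Calcite represents row types. */
final class WindowRowTypeSketch {

  /** Models a window node: input fields first, computed fields appended unchanged. */
  static List<String> appendWindowFields(List<String> inputFields,
      List<String> windowFields) {
    List<String> rowType = new ArrayList<>(inputFields);
    rowType.addAll(windowFields);
    return rowType;
  }

  public static void main(String[] args) {
    // T(v BIGINT) plus COUNT(*) OVER (ORDER BY v) aliased as v.
    List<String> rowType = appendWindowFields(List.of("v"), List.of("v"));

    // The resulting type has two fields with the same name.
    System.out.println(rowType); // prints [v, v]

    // A lookup by name can only ever find the first match...
    System.out.println(rowType.indexOf("v")); // prints 0

    // ...whereas an index identifies each field unambiguously:
    // index 0 is the input column, index 1 is the window result.
    System.out.println(rowType.size()); // prints 2
  }
}
```

In this model the duplicate only matters to a consumer that resolves fields by name, which is the point made above about downstream operators using indexes.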

For example, in:
{code:sql}
SELECT * FROM T t1 JOIN T t2 ON t1.v = t2.v
{code}
the final SELECT is effectively handled as (v, v0).
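
The (v, v0) naming can be sketched as a simple uniquifying pass over the field names. The following is an assumption-laden illustration in plain Java, not Calcite's own uniquifying logic: the class {{FieldNameUniquifierSketch}} and the method {{uniquify}} are hypothetical, and the suffix scheme (append 0, 1, 2, ... until the name is unused) is chosen only because it reproduces the (v, v0) result mentioned above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; Calcite's actual naming scheme may differ. */
final class FieldNameUniquifierSketch {

  /** Returns the names in order, adding a numeric suffix to any repeated name. */
  static List<String> uniquify(List<String> names) {
    Set<String> used = new HashSet<>();
    List<String> result = new ArrayList<>(names.size());
    for (String name : names) {
      String candidate = name;
      int suffix = 0;
      // Set.add returns false when the candidate is already taken.
      while (!used.add(candidate)) {
        candidate = name + suffix++;
      }
      result.add(candidate);
    }
    return result;
  }

  public static void main(String[] args) {
    // Field names of T t1 JOIN T t2, before uniquifying.
    System.out.println(uniquify(List.of("v", "v"))); // prints [v, v0]
  }
}
```

If a window node resolved conflicts this way, the same pass would apply to its appended fields, so the output of the first query would be named (v, v0) rather than (v, v).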

Therefore, it might be more appropriate to use:
{code:sql}
SELECT *, COUNT(*) OVER (ORDER BY v) v FROM T
{code}
as the example, because all columns are included in the final output and there 
is no additional projection. In theory, the same situation applies to joins as 
well (although, due to the change in the ProjectRemoveRule in 
https://issues.apache.org/jira/browse/CALCITE-6850, where matching {{anyInput}} 
was changed to {{oneInput}}).

Therefore, if we decide to change the names, should we unify this behavior 
globally rather than limiting the change to the Window node?

What do you think?


was (Author: lincoln.86xy):
I think I understand the issue you’re referring to now.
Given a table T(v BIGINT) and the query:
{code}
SELECT COUNT(*) OVER (ORDER BY v) v FROM T
{code}
since a window operator itself does not project columns, it always appends 
newly computed results to the input schema. As a result, when preserving 
original field names, we may encounter a situation similar to name conflicts in 
joins. However, this does not directly affect downstream correctness, because 
downstream operators access columns by index rather than by name.

For example, in:
{code}
SELECT * FROM T t1 JOIN T t2 ON t1.v = t2.v
{code}

the final SELECT is effectively handled as (v, v0).

Therefore, it might be more appropriate to use:
{code}
SELECT *, COUNT(*) OVER (ORDER BY v) v FROM T
{code}

as the example. In this case, since all columns are included in the final 
output and there is no additional projection, we need to resolve column name 
conflicts inside the window node itself, following the projection naming rules.

What do you think?

> Types generated for Window can contain multiple fields with the same name
> -------------------------------------------------------------------------
>
>                 Key: CALCITE-7390
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7390
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.42
>            Reporter: Mihai Budiu
>            Priority: Minor
>
> As a result of CALCITE-7375 some LogicalWindow operations can have types with 
> duplicate names. 
> [~lincoln.86xy] you have introduced this change.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
