[
https://issues.apache.org/jira/browse/CALCITE-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Sysolyatin updated CALCITE-5388:
---------------------------------------
Description:
EnumerableWindow.getPartitionIterator creates an expression that instantiates
the 'tempList' array [1], and EnumerableWindow.implement creates an expression
that clears this 'tempList'. Because the collection's state is mutable, it must
not be shared with another instance of EnumerableWindow, yet that is exactly
what happens in the following use case:
{code:sql}
with
  CTE1(rownr1, val1) as (
    select ROW_NUMBER() OVER(ORDER BY id ASC), id
    from (values (1), (2)) as Vals1(id) ),
  CTE2(rownr2, val2) as (
    select ROW_NUMBER() OVER(ORDER BY id ASC), id
    from (values (1), (2)) as Vals2(id) )
select
  CTE1.rownr1,
  CTE1.val1,
  CTE2.rownr2,
  CTE2.val2
from
  CTE1,
  CTE2
where
  CTE1.val1 = CTE2.val2
{code}
Generated plan:
{code}
EnumerableHashJoin(condition=[=($1, $3)], joinType=[inner])
  EnumerableSort(sort0=[$1], dir0=[ASC])
    EnumerableCalc(expr#0..1=[{inputs}], EXPR$0=[$t1], ID=[$t0])
      EnumerableWindow(window#0=[window(order by [0] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])])
        EnumerableValues(tuples=[[{ 1 }, { 2 }]])
  EnumerableSort(sort0=[$1], dir0=[ASC])
    EnumerableCalc(expr#0..1=[{inputs}], EXPR$0=[$t1], ID=[$t0])
      EnumerableWindow(window#0=[window(order by [0] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])])
        EnumerableValues(tuples=[[{ 1 }, { 2 }]])
{code}
Calcite's expression optimizer tries to remove duplicate expressions. As a
result, the same 'tempList' instance is used by both EnumerableWindow nodes and
the query returns an empty result instead of:
|ROWNR1|VAL1|VAL2|
|1|1|1|
|2|2|2|
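Why a shared, mutable 'tempList' can empty the result is easiest to see outside Calcite. The following is a minimal sketch in plain Java, not Calcite's generated code: the class SharedBufferDemo, its nested Window class and the join method are hypothetical names, and the exact order of operations in the real generated code may differ. Each simulated window fills its scratch buffer, and clears it once its rows have been read, mirroring the clearing step described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedBufferDemo {

  /** Hypothetical stand-in for a window operator that numbers its input rows. */
  static final class Window {
    private final List<Integer> tempList; // scratch buffer, possibly aliased

    Window(int[] input, List<Integer> tempList) {
      this.tempList = tempList;
      tempList.clear();
      for (int v : input) {
        tempList.add(v);
      }
    }

    /** Emits {rowNumber, value} pairs from the buffer, then clears the buffer. */
    List<int[]> drain() {
      List<int[]> rows = new ArrayList<>();
      for (int i = 0; i < tempList.size(); i++) {
        rows.add(new int[] {i + 1, tempList.get(i)});
      }
      tempList.clear();
      return rows;
    }
  }

  /** Joins two numbered inputs on their value column, hash-join style. */
  public static List<int[]> join(List<Integer> buf1, List<Integer> buf2) {
    Window w1 = new Window(new int[] {1, 2}, buf1);
    Window w2 = new Window(new int[] {1, 2}, buf2);

    // Build side: draining w2 clears buf2.
    Map<Integer, int[]> build = new HashMap<>();
    for (int[] r : w2.drain()) {
      build.put(r[1], r);
    }

    // Probe side: if buf1 and buf2 are the same list, it is already empty here.
    List<int[]> out = new ArrayList<>();
    for (int[] l : w1.drain()) {
      int[] r = build.get(l[1]);
      if (r != null) {
        out.add(new int[] {l[0], l[1], r[0], r[1]});
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Separate buffers: both rows survive the join.
    System.out.println(join(new ArrayList<>(), new ArrayList<>()).size()); // 2
    // One aliased buffer: the join produces nothing.
    List<Integer> shared = new ArrayList<>();
    System.out.println(join(shared, shared).size()); // 0
  }
}
```

With separate buffers the join yields two rows. With one aliased buffer it yields none, which matches the symptom reported here.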
[1]
https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableWindow.java#L677
[2]
https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableWindow.java#L696
[3]
https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableWindow.java#L518
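The remedy named in the issue title is to keep the 'tempList' declaration out of duplicate elimination. The sketch below illustrates that idea with a toy: DeclarationPool and its declare method are hypothetical and are not Calcite's API. Declarations are merged when their initializer text matches, unless the caller opts out with optimize = false.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeclarationPool {
  // Maps initializer text to the variable that already holds it.
  private final Map<String, String> byInitializer = new HashMap<>();
  private final List<String> declarations = new ArrayList<>();
  private int counter = 0;

  /** Declares a variable and returns the name that refers to its value. */
  public String declare(String initializer, boolean optimize) {
    if (optimize) {
      String existing = byInitializer.get(initializer);
      if (existing != null) {
        return existing; // duplicate expression: reuse the earlier variable
      }
    }
    String name = "v" + counter++;
    declarations.add(name + " = " + initializer);
    if (optimize) {
      byInitializer.put(initializer, name);
    }
    return name;
  }

  public static void main(String[] args) {
    DeclarationPool pool = new DeclarationPool();
    String init = "new java.util.ArrayList()";
    // Optimized: both requests collapse into one variable (unsafe if mutable).
    System.out.println(pool.declare(init, true).equals(pool.declare(init, true))); // true
    // Unoptimized: every request gets its own variable.
    System.out.println(pool.declare(init, false).equals(pool.declare(init, false))); // false
  }
}
```

Merging is harmless for immutable values, but a freshly allocated collection that is later mutated needs its own variable per operator.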
was:
EnumerableWindow.getPartitionIterator creates an expression that instantiates
either the 'tempList' array [1] or the 'multiMap' map [2], depending on the use
case, and EnumerableWindow.implement creates an expression that clears this
collection. Because the collection's state is mutable, it must not be shared
with another instance of EnumerableWindow, yet that is exactly what happens in
the following use case:
{code:sql}
with
  CTE1(rownr1, val1) as (
    select ROW_NUMBER() OVER(ORDER BY id ASC), id
    from (values (1), (2)) as Vals1(id) ),
  CTE2(rownr2, val2) as (
    select ROW_NUMBER() OVER(ORDER BY id ASC), id
    from (values (1), (2)) as Vals2(id) )
select
  CTE1.rownr1,
  CTE1.val1,
  CTE2.rownr2,
  CTE2.val2
from
  CTE1,
  CTE2
where
  CTE1.val1 = CTE2.val2
{code}
Generated plan:
{code}
EnumerableHashJoin(condition=[=($1, $3)], joinType=[inner])
  EnumerableSort(sort0=[$1], dir0=[ASC])
    EnumerableCalc(expr#0..1=[{inputs}], EXPR$0=[$t1], ID=[$t0])
      EnumerableWindow(window#0=[window(order by [0] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])])
        EnumerableValues(tuples=[[{ 1 }, { 2 }]])
  EnumerableSort(sort0=[$1], dir0=[ASC])
    EnumerableCalc(expr#0..1=[{inputs}], EXPR$0=[$t1], ID=[$t0])
      EnumerableWindow(window#0=[window(order by [0] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])])
        EnumerableValues(tuples=[[{ 1 }, { 2 }]])
{code}
Calcite's expression optimizer tries to remove duplicate expressions. As a
result, the same 'tempList' instance is used by both EnumerableWindow nodes and
the query returns an empty result instead of:
|ROWNR1|VAL1|VAL2|
|1|1|1|
|2|2|2|
[1]
https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableWindow.java#L677
[2]
https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableWindow.java#L696
[3]
https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableWindow.java#L518
> tempList expression inside EnumerableWindow.getPartitionIterator should be
> unoptimized
> --------------------------------------------------------------------------------------
>
> Key: CALCITE-5388
> URL: https://issues.apache.org/jira/browse/CALCITE-5388
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.32.0
> Reporter: Magnus Mogren
> Assignee: Dmitry Sysolyatin
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.33.0
>
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)