[ 
https://issues.apache.org/jira/browse/SPARK-44448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Chen updated SPARK-44448:
------------------------------
    Description: 
Top-k filters on a dense_rank() window function return wrong results, due to a 
bug in optimization InferWindowGroupLimit, specifically in the code for 
DenseRankLimitIterator, introduced in 
https://issues.apache.org/jira/browse/SPARK-37099.

Repro:
{code:java}
create or replace temp view t1 (p, o) as values (1, 1), (1, 1), (1, 2), (2, 1), 
(2, 1), (2, 2);

select * from (select *, dense_rank() over (partition by p order by o) as rnk 
from t1) where rnk = 1;{code}
Spark result:
{code:java}
[1,1,1]
[1,1,1]
[2,1,1]{code}
Correct result:
{code:java}
[1,1,1]
[1,1,1]
[2,1,1]
[2,1,1]{code}
 

The bug is in {{{}DenseRankLimitIterator{}}}, it fails to reset state properly 
when transitioning from one window partition to the next. {{reset}} only resets 
{{{}rank = 0{}}}, what it is missing is to reset {{{}currentRankRow = null{}}}. 
This means that when processing the second and later window partitions, the 
rank incorrectly gets incremented based on comparing the ordering of the last 
row of the previous partition to the first row of the new partition.

This means that a dense_rank window func that has more than one window 
partition and more than one row with dense_rank = 1 in the second or later 
partitions can give wrong results when optimized.

({{{}RankLimitIterator{}}} narrowly avoids this bug by happenstance, the first 
row in the new partition will try to increment rank, but increment it by the 
value of count which is 0, so it happens to work by accident).

Unfortunately, tests for the optimization only had a single row per rank, so 
did not catch the bug as the bug requires multiple rows per rank.

  was:
Top-k filters on a dense_rank() window function return wrong results, due to a 
bug in optimization InferWindowGroupLimit, specifically in the code for 
DenseRankLimitIterator.

Repro:
{code:java}
create or replace temp view t1 (p, o) as values (1, 1), (1, 1), (1, 2), (2, 1), 
(2, 1), (2, 2);

select * from (select *, dense_rank() over (partition by p order by o) as rnk 
from t1) where rnk = 1;{code}
Spark result:
{code:java}
[1,1,1]
[1,1,1]
[2,1,1]{code}
Correct result:
{code:java}
[1,1,1]
[1,1,1]
[2,1,1]
[2,1,1]{code}
 

The bug is in {{{}DenseRankLimitIterator{}}}, it fails to reset state properly 
when transitioning from one window partition to the next. {{reset}} only resets 
{{{}rank = 0{}}}, what it is missing is to reset {{{}currentRankRow = null{}}}. 
This means that when processing the second and later window partitions, the 
rank incorrectly gets incremented based on comparing the ordering of the last 
row of the previous partition to the first row of the new partition.

This means that a dense_rank window func that has more than one window 
partition and more than one row with dense_rank = 1 in the second or later 
partitions can give wrong results when optimized.

({{{}RankLimitIterator{}}} narrowly avoids this bug by happenstance, the first 
row in the new partition will try to increment rank, but increment it by the 
value of count which is 0, so it happens to work by accident).

Unfortunately, tests for the optimization only had a single row per rank, so 
did not catch the bug as the bug requires multiple rows per rank.


> Wrong results for dense_rank() <= k from InferWindowGroupLimit and 
> DenseRankLimitIterator
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-44448
>                 URL: https://issues.apache.org/jira/browse/SPARK-44448
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Jack Chen
>            Priority: Major
>
> Top-k filters on a dense_rank() window function return wrong results, due to 
> a bug in optimization InferWindowGroupLimit, specifically in the code for 
> DenseRankLimitIterator, introduced in 
> https://issues.apache.org/jira/browse/SPARK-37099.
> Repro:
> {code:java}
> create or replace temp view t1 (p, o) as values (1, 1), (1, 1), (1, 2), (2, 
> 1), (2, 1), (2, 2);
> select * from (select *, dense_rank() over (partition by p order by o) as rnk 
> from t1) where rnk = 1;{code}
> Spark result:
> {code:java}
> [1,1,1]
> [1,1,1]
> [2,1,1]{code}
> Correct result:
> {code:java}
> [1,1,1]
> [1,1,1]
> [2,1,1]
> [2,1,1]{code}
>  
> The bug is in {{{}DenseRankLimitIterator{}}}, it fails to reset state 
> properly when transitioning from one window partition to the next. {{reset}} 
> only resets {{{}rank = 0{}}}, what it is missing is to reset 
> {{{}currentRankRow = null{}}}. This means that when processing the second and 
> later window partitions, the rank incorrectly gets incremented based on 
> comparing the ordering of the last row of the previous partition to the first 
> row of the new partition.
> This means that a dense_rank window func that has more than one window 
> partition and more than one row with dense_rank = 1 in the second or later 
> partitions can give wrong results when optimized.
> ({{{}RankLimitIterator{}}} narrowly avoids this bug by happenstance, the 
> first row in the new partition will try to increment rank, but increment it 
> by the value of count which is 0, so it happens to work by accident).
> Unfortunately, tests for the optimization only had a single row per rank, so 
> did not catch the bug as the bug requires multiple rows per rank.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to