[jira] [Resolved] (SPARK-43760) Incorrect attribute nullability after RewriteCorrelatedScalarSubquery leads to incorrect query results

Wenchen Fan (Jira) Tue, 30 May 2023 17:30:07 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-43760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Wenchen Fan resolved SPARK-43760.
---------------------------------
    Fix Version/s: 3.5.0
       Resolution: Fixed

Issue resolved by pull request 41287
[https://github.com/apache/spark/pull/41287]

> Incorrect attribute nullability after RewriteCorrelatedScalarSubquery leads 
> to incorrect query results
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-43760
>                 URL: https://issues.apache.org/jira/browse/SPARK-43760
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Andrey Gubichev
>            Priority: Major
>             Fix For: 3.5.0
>
>
> The following query:
>  
> {code:java}
> select * from (
>  select t1.id c1, (
>   select t2.id c from range (1, 2) t2
>   where t1.id = t2.id  ) c2
>  from range (1, 3) t1 ) t
> where t.c2 is not null
> -- !query schema
> struct<c1:bigint,c2:bigint>
> -- !query output
> 1     1
> 2     NULL
>  {code}
>  
> should return one row, because the IsNotNull predicate is supposed to 
> remove the second row. However, because nullability is propagated 
> incorrectly after subquery decorrelation, the output of the subquery is 
> wrongly declared non-nullable, so the predicate is constant-folded to 
> true and both rows are returned.
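
The failure mode described above can be illustrated with a small toy model. This is a sketch in plain Python, not Spark's optimizer code: the names Attribute, left_outer_join_scalar, fold_is_not_null and run_filter are hypothetical and exist only for this illustration. It assumes the decorrelated subquery behaves like a left outer join, and that the optimizer replaces IsNotNull(attr) with a true literal whenever attr is declared non-nullable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """Hypothetical stand-in for a plan attribute with a declared nullability."""
    name: str
    nullable: bool


def left_outer_join_scalar(outer_ids, inner_ids):
    """Model the decorrelated scalar subquery as t1 LEFT OUTER JOIN t2 ON t1.id = t2.id.

    Outer rows without a match get None (SQL NULL) for the subquery column.
    """
    inner = set(inner_ids)
    return [(i, i if i in inner else None) for i in outer_ids]


def fold_is_not_null(attr):
    """Toy constant folding: IsNotNull(attr) becomes True if attr is non-nullable.

    Returns None when the predicate has to be evaluated per row.
    """
    return True if not attr.nullable else None


def run_filter(rows, attr):
    """Apply 'WHERE c2 IS NOT NULL' after the (toy) optimizer has run."""
    if fold_is_not_null(attr) is True:
        # Predicate was folded away, so no row is ever checked.
        return list(rows)
    return [row for row in rows if row[1] is not None]


t1 = range(1, 3)  # ids 1 and 2, as in range (1, 3)
t2 = range(1, 2)  # id 1 only, as in range (1, 2)
rows = left_outer_join_scalar(t1, t2)

# Wrong metadata: c2 declared non-nullable although the outer join can produce NULL.
buggy = run_filter(rows, Attribute("c2", nullable=False))
# Correct metadata: c2 declared nullable, so the filter is really evaluated.
fixed = run_filter(rows, Attribute("c2", nullable=True))

print(buggy)  # [(1, 1), (2, None)]
print(fixed)  # [(1, 1)]
```

With the wrong nullability flag the filter is a no-op and the NULL row survives, which matches the two-row output in the report; with the flag set correctly only the row (1, 1) remains.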



