Andrey Gubichev created SPARK-43760:
---------------------------------------
Summary: Incorrect attribute nullability after
RewriteCorrelatedScalarSubquery leads to incorrect query results
Key: SPARK-43760
URL: https://issues.apache.org/jira/browse/SPARK-43760
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.4.0
Reporter: Andrey Gubichev
The following query:
{{select *}}
{{from range(1, 3) t1}}
{{where (select sum(c) from (}}
{{ select t2.id * t2.id c}}
{{ from range (1, 2) t2 where t1.id = t2.id}}
{{ group by t2.id}}
{{ )}}
{{) is not null;}}
should return 1 row: for t1.id = 2 no row of t2 matches, so the scalar
subquery evaluates to NULL and the IsNotNull predicate is supposed to remove
that row. However, because nullability is propagated incorrectly after
subquery decorrelation, the output of the subquery is wrongly declared as
non-nullable, so the predicate is constant-folded to True and both rows are
returned.
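
To make the expected result concrete, here is a minimal sketch in plain Python that models the SQL semantics of the query above. It is not Spark code and does not reproduce the optimizer rule; the names scalar_subquery, expected_rows and folded_rows are hypothetical and exist only for this illustration. NULL is modeled as None, and range(1, 3) is taken to be end-exclusive, as in the report.

```python
from typing import Dict, Optional


def scalar_subquery(outer_id: int) -> Optional[int]:
    """Model of the correlated subquery for one outer row t1.id = outer_id.

    select sum(c) from (
      select t2.id * t2.id c from range(1, 2) t2
      where t1.id = t2.id group by t2.id)
    """
    # Inner query: keep t2 rows matching the correlated predicate, one row per group.
    groups: Dict[int, int] = {}
    for t2_id in range(1, 2):  # range(1, 2) contains only id = 1
        if t2_id == outer_id:
            groups[t2_id] = t2_id * t2_id
    # SQL sum() over zero input rows is NULL, modeled here as None.
    if not groups:
        return None
    return sum(groups.values())


# Correct evaluation: the IS NOT NULL predicate is checked per outer row.
expected_rows = [t1_id for t1_id in range(1, 3) if scalar_subquery(t1_id) is not None]

# Behaviour described in the report: the predicate is treated as always True,
# so no outer row is filtered out.
folded_rows = [t1_id for t1_id in range(1, 3) if True]

print(expected_rows)
print(folded_rows)
```

In this model the outer row with id = 2 is the only one whose subquery result is None, which is why the filter must be kept rather than folded away.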