Andrey Gubichev created SPARK-46468:
---------------------------------------
Summary: COUNT bug in lateral/exists subqueries
Key: SPARK-46468
URL: https://issues.apache.org/jira/browse/SPARK-46468
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.0.0
Reporter: Andrey Gubichev
This issue tracks some further instances of the COUNT bug.
One example is this test from join-lateral.sql:
[https://github.com/apache/spark/blame/master/sql/core/src/test/resources/sql-tests/results/join-lateral.sql.out#L757]
According to PostgreSQL, the query should return 2 rows:
 c1 | c2 | sum
----+----+------
  0 |  1 |    2
  1 |  2 | NULL
whereas Spark SQL only returns the first one.
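
For readers unfamiliar with the term: the COUNT bug usually refers to a decorrelation error in which an aggregate subquery, which should yield exactly one row per outer row even when nothing matches, is rewritten into an inner join against a grouped table, so that outer rows with no match disappear. The sketch below illustrates only that general mechanism. It is not the query from the linked test and it does not run Spark: it uses Python's sqlite3 module, the t1/t2 values given in this report, and a hypothetical SUM query chosen for illustration.

```python
import sqlite3

# In-memory database holding the t1/t2 values from this report.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1(c1, c2);
    CREATE TABLE t2(c1, c2);
    INSERT INTO t1 VALUES (0, 1), (1, 2);
    INSERT INTO t2 VALUES (0, 2), (0, 3);
""")

# Correlated form (hypothetical query): the aggregate runs once per t1 row,
# so a t1 row with no matching t2 rows still appears, with a NULL sum.
correlated = conn.execute("""
    SELECT t1.c1, t1.c2,
           (SELECT sum(t2.c2) FROM t2 WHERE t2.c1 = t1.c1) AS total
    FROM t1
    ORDER BY t1.c1
""").fetchall()

# Naive decorrelation: inner join against a grouped subquery.
# Groups that do not exist in t2 produce no join partner, so the
# unmatched t1 row is silently dropped.
inner_rewrite = conn.execute("""
    SELECT t1.c1, t1.c2, s.total
    FROM t1
    JOIN (SELECT c1, sum(c2) AS total FROM t2 GROUP BY c1) AS s
      ON s.c1 = t1.c1
    ORDER BY t1.c1
""").fetchall()

# A left outer join keeps the unmatched t1 row and restores the NULL.
left_rewrite = conn.execute("""
    SELECT t1.c1, t1.c2, s.total
    FROM t1
    LEFT JOIN (SELECT c1, sum(c2) AS total FROM t2 GROUP BY c1) AS s
      ON s.c1 = t1.c1
    ORDER BY t1.c1
""").fetchall()

print("correlated:   ", correlated)
print("inner rewrite:", inner_rewrite)
print("left rewrite: ", left_rewrite)
```

In this sketch the inner-join rewrite loses the (1, 2) row, while the correlated form and the left-join rewrite both keep it with a NULL sum. Whether the Spark behaviour reported here has the same root cause is an assumption, not something this report states.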
A similar instance is the following query, which should return 1 row from t1 but
currently returns an empty result:
{code:sql}
CREATE TEMPORARY VIEW t1(c1, c2) AS VALUES (0, 1), (1, 2);
CREATE TEMPORARY VIEW t2(c1, c2) AS VALUES (0, 2), (0, 3);

SELECT tt1.c2
FROM t1 AS tt1
WHERE tt1.c1 IN (
  SELECT max(tt2.c1)
  FROM t2 AS tt2
  WHERE tt1.c2 IS NULL);
{code}