[ 
https://issues.apache.org/jira/browse/SPARK-46468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Gubichev updated SPARK-46468:
------------------------------------
    Description: 
Some further instances of a COUNT bug.

 

One example is this test from join-lateral.sql

[https://github.com/apache/spark/blame/master/sql/core/src/test/resources/sql-tests/results/join-lateral.sql.out#L757]

 

According to PostgreSQL, the query should return 2 rows:

 c1 | c2 | sum
----+----+------
  0 |  1 |    2
  1 |  2 | NULL

 

whereas Spark SQL only returns the first one.

 

A similar instance is the following query, which should return 1 row from t1 but 
currently returns an empty result:

{{create temporary view t1(c1, c2) as values (0, 1), (1, 2);}}
{{create temporary view t2(c1, c2) as values (0, 2), (0, 3);}}

{{SELECT tt1.c2}}
{{FROM t1 as tt1}}
{{WHERE tt1.c1 in (}}
{{  select max(tt2.c1)}}
{{  from t2 as tt2}}
{{  where tt1.c2 is null);}}

  was:
Some further instances of a COUNT bug.

 

One example is this test from join-lateral.sql

[https://github.com/apache/spark/blame/master/sql/core/src/test/resources/sql-tests/results/join-lateral.sql.out#L757]

 

According to PostgreSQL, the query should return 2 rows:

 c1 | c2 | sum 

----+----+-----

  0 |  1 |   2

  1 |  2 |    NULL

 

whereas Spark SQL only returns the first one.

 

Similar instance is the following query, which should return 1 row from t1 but 
has an empty result now:

{{create temporary view t1(c1, c2) as values (0, 1), (1, 2);}}
{{create temporary view t2(c1, c2) as values (0, 2), (0, 3);}}


{{SELECT tt1.c2}}
{{FROM t1 as tt1}}
{{WHERE tt1.c1 in (}}
{{  select max(tt2.c1)}}
{{  from t2 as tt2}}
{{  where tt1.c2 is null);}}



> COUNT bug in lateral/exists subqueries
> --------------------------------------
>
>                 Key: SPARK-46468
>                 URL: https://issues.apache.org/jira/browse/SPARK-46468
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Andrey Gubichev
>            Priority: Major
>
> Some further instances of a COUNT bug.
>  
> One example is this test from join-lateral.sql
> [https://github.com/apache/spark/blame/master/sql/core/src/test/resources/sql-tests/results/join-lateral.sql.out#L757]
>  
> According to PostgreSQL, the query should return 2 rows:
>  c1 | c2 | sum
> ----+----+------
>   0 |  1 |    2
>   1 |  2 | NULL
>  
> whereas Spark SQL only returns the first one.
>  
> A similar instance is the following query, which should return 1 row from t1 
> but currently returns an empty result:
> {{create temporary view t1(c1, c2) as values (0, 1), (1, 2);}}
> {{create temporary view t2(c1, c2) as values (0, 2), (0, 3);}}
> {{SELECT tt1.c2}}
> {{FROM t1 as tt1}}
> {{WHERE tt1.c1 in (}}
> {{  select max(tt2.c1)}}
> {{  from t2 as tt2}}
> {{  where tt1.c2 is null);}}
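
For readers unfamiliar with the term, the general shape of a COUNT bug can be sketched outside Spark. The snippet below is only an illustration under stated assumptions: it uses Python's sqlite3 module as a stand-in engine (it does not run Spark and says nothing about how Spark's optimizer rewrites subqueries), it reuses the t1/t2 data from the report, and the two queries are hypothetical examples, not the ones from join-lateral.sql. It contrasts a correlated scalar aggregate, which yields one value per outer row even when no inner rows match, with a naive inner-join rewrite, which drops the unmatched outer row.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption: this only
# illustrates SQL semantics, it is not Spark).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1(c1 INTEGER, c2 INTEGER);
    CREATE TABLE t2(c1 INTEGER, c2 INTEGER);
    INSERT INTO t1 VALUES (0, 1), (1, 2);
    INSERT INTO t2 VALUES (0, 2), (0, 3);
""")

# Correlated scalar aggregates: evaluated per t1 row. An aggregate without
# GROUP BY returns exactly one row even over empty input
# (count(*) gives 0, max gives NULL), so every t1 row is kept.
correlated = conn.execute("""
    SELECT t1.c1, t1.c2,
           (SELECT count(*)   FROM t2 WHERE t2.c1 = t1.c1) AS cnt,
           (SELECT max(t2.c2) FROM t2 WHERE t2.c1 = t1.c1) AS mx
    FROM t1
    ORDER BY t1.c1
""").fetchall()

# Naive decorrelation as inner join + GROUP BY: a t1 row with no matching
# t2 row produces no join output, so its group never exists.
naive = conn.execute("""
    SELECT t1.c1, t1.c2, count(*) AS cnt, max(t2.c2) AS mx
    FROM t1 JOIN t2 ON t2.c1 = t1.c1
    GROUP BY t1.c1, t1.c2
    ORDER BY t1.c1
""").fetchall()

print(correlated)  # [(0, 1, 2, 3), (1, 2, 0, None)]
print(naive)       # [(0, 1, 2, 3)]
```

The row (1, 2, ...) that disappears in the second result is the kind of missing row the report describes; a rewrite that preserves it typically needs an outer join plus handling of the aggregate's value on empty input.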



