[
https://issues.apache.org/jira/browse/SPARK-35080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321829#comment-17321829
]
Apache Spark commented on SPARK-35080:
--------------------------------------
User 'allisonwang-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/32179
> Correlated subqueries with equality predicates can return wrong results
> -----------------------------------------------------------------------
>
> Key: SPARK-35080
> URL: https://issues.apache.org/jira/browse/SPARK-35080
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: Allison Wang
> Priority: Major
>
> Correlated subqueries with an aggregate that pass CheckAnalysis (because they
> contain only correlated equality predicates) can still return wrong results.
> An equality predicate does not guarantee a one-to-one mapping between inner
> and outer attributes, so pulling the inner attributes up through an Aggregate
> changes the semantics of the plan and produces wrong results. The
> decorrelation framework does not currently support these types of correlated
> subqueries, so they should be blocked in CheckAnalysis.
> Example 1:
> {code:sql}
> create or replace view t1(c) as values ('a'), ('b');
> create or replace view t2(c) as values ('ab'), ('abc'), ('bc');
> select c, (select count(*) from t2 where t1.c = substring(t2.c, 1, 1)) from t1;
> {code}
> Correct results: [(a, 2), (b, 1)]
> Spark results:
> {code:java}
> +---+-----------------+
> |c  |scalarsubquery(c)|
> +---+-----------------+
> |a  |1                |
> |a  |1                |
> |b  |1                |
> +---+-----------------+{code}
> Example 2:
> {code:sql}
> create or replace view t1(a, b) as values (0, 6), (1, 5), (2, 4), (3, 3);
> create or replace view t2(c) as values (6);
> select c, (select count(*) from t1 where a + b = c) from t2;{code}
> Correct results: [(6, 4)]
> Spark results:
> {code:java}
> +---+-----------------+
> |c  |scalarsubquery(c)|
> +---+-----------------+
> |6  |1                |
> |6  |1                |
> |6  |1                |
> |6  |1                |
> +---+-----------------+
> {code}
>
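The two examples above can be reproduced outside Spark with a small Python model. This is only an illustrative sketch and not Spark's decorrelation code: the helper names `correlated_count` and `pulled_up_count` are hypothetical, and the "pulled up" version assumes the rewrite behaves like grouping by the raw inner attributes before joining on the correlated predicate, which is one reading of the description above.

```python
from collections import Counter


def correlated_count(outer_rows, inner_rows, matches):
    """Reference semantics: evaluate count(*) once per outer row."""
    return [(o, sum(1 for i in inner_rows if matches(o, i))) for o in outer_rows]


def pulled_up_count(outer_rows, inner_rows, matches):
    """Simplified model of the faulty rewrite (an assumption, not Spark code):
    aggregate grouped by the raw inner attributes, then join on the predicate."""
    grouped = Counter(inner_rows)  # count(*) ... GROUP BY inner attributes
    return [(o, n) for o in outer_rows for i, n in grouped.items() if matches(o, i)]


# Example 1: t1.c = substring(t2.c, 1, 1)
ex1_outer = ["a", "b"]
ex1_inner = ["ab", "abc", "bc"]
ex1_pred = lambda o, i: o == i[:1]

# Example 2: a + b = c
ex2_outer = [6]
ex2_inner = [(0, 6), (1, 5), (2, 4), (3, 3)]
ex2_pred = lambda c, ab: ab[0] + ab[1] == c

ex1_correct = correlated_count(ex1_outer, ex1_inner, ex1_pred)
ex1_rewritten = pulled_up_count(ex1_outer, ex1_inner, ex1_pred)
ex2_correct = correlated_count(ex2_outer, ex2_inner, ex2_pred)
ex2_rewritten = pulled_up_count(ex2_outer, ex2_inner, ex2_pred)

print(ex1_correct, ex1_rewritten)
print(ex2_correct, ex2_rewritten)
```

Under these assumptions, the per-row evaluation gives the correct results listed in the issue, while the grouped version emits one row per distinct inner value, matching the row counts in the Spark output above. Several inner values map to the same outer value, so grouping by the inner attribute splits what should be a single count.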