[ 
https://issues.apache.org/jira/browse/SPARK-43780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-43780.
---------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 41301
[https://github.com/apache/spark/pull/41301]

> Support correlated columns in join ON conditions
> ------------------------------------------------
>
>                 Key: SPARK-43780
>                 URL: https://issues.apache.org/jira/browse/SPARK-43780
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Andrey Gubichev
>            Assignee: Andrey Gubichev
>            Priority: Major
>             Fix For: 4.0.0
>
>
> Subqueries that have joins with outer references in join conditions should be 
> supported.
> For example:
>  
> {code:java}
> spark-sql (default)> create view t0(t0a) as values (0), (1), (2);
> spark-sql (default)> create view t1(t1a) as values (1), (2), (3);
> spark-sql (default)> create view t2(t2a) as values (2), (3), (4);
> spark-sql (default)> select * from t0 join lateral (select * from t1 join t2 
> on t1a = t2a and t1a = t0a);
> [UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.CORRELATED_REFERENCE] Unsupported 
> subquery expression: Expressions referencing the outer query are not 
> supported outside of WHERE/HAVING clauses: "((t1a = t2a) AND (t1a = t0a))".; 
> line 1 pos 48;
> Project [t0a#15, t1a#17, t2a#19]
> +- LateralJoin lateral-subquery#14 [t0a#15], Inner
>    :  +- SubqueryAlias __auto_generated_subquery_name
>    :     +- Project [t1a#17, t2a#19]
>    :        +- Join Inner, ((t1a#17 = t2a#19) AND (t1a#17 = outer(t0a#15)))
>    :           :- SubqueryAlias spark_catalog.default.t1
>    :           :  +- View (`spark_catalog`.`default`.`t1`, [t1a#17])
>    :           :     +- Project [cast(col1#18 as int) AS t1a#17]
>    :           :        +- LocalRelation [col1#18]
>    :           +- SubqueryAlias spark_catalog.default.t2
>    :              +- View (`spark_catalog`.`default`.`t2`, [t2a#19])
>    :                 +- Project [cast(col1#20 as int) AS t2a#19]
>    :                    +- LocalRelation [col1#20]
>    +- SubqueryAlias spark_catalog.default.t0
>       +- View (`spark_catalog`.`default`.`t0`, [t0a#15])
>          +- Project [cast(col1#16 as int) AS t0a#15]
>             +- LocalRelation [col1#16] {code}
> Here, the subquery contains a join whose ON condition references the outer
> table t0. This should be handled much like correlation in Filter predicates.
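
The comparison to Filter predicates can be illustrated with a small model. For an inner join, a conjunct in the ON condition selects the same rows as the same conjunct applied as a filter above the join, so the correlated predicate `t1a = t0a` should give one result in either position. The sketch below is plain Python over the three example tables, not Spark code; the function names `lateral_join_on` and `lateral_join_where` are hypothetical, and it makes no claim about how the optimizer implements decorrelation.

```python
# Plain-Python model of the query in the report; no Spark required.
# Each list stands in for a single-column view from the example.
t0 = [0, 1, 2]
t1 = [1, 2, 3]
t2 = [2, 3, 4]


def lateral_join_on(t0, t1, t2):
    """Correlated predicate evaluated inside the join's ON condition."""
    return [
        (t0a, t1a, t2a)
        for t0a in t0          # outer row driving the lateral subquery
        for t1a in t1
        for t2a in t2
        if t1a == t2a and t1a == t0a
    ]


def lateral_join_where(t0, t1, t2):
    """Same predicate applied as a filter above an uncorrelated inner join."""
    rows = []
    for t0a in t0:
        # Uncorrelated part of the join condition only.
        joined = [(t1a, t2a) for t1a in t1 for t2a in t2 if t1a == t2a]
        # Correlated predicate applied afterwards, as a WHERE clause would.
        rows.extend((t0a, t1a, t2a) for (t1a, t2a) in joined if t1a == t0a)
    return rows


print(lateral_join_on(t0, t1, t2))     # [(2, 2, 2)]
print(lateral_join_where(t0, t1, t2))  # [(2, 2, 2)]
```

Both forms return the single row (2, 2, 2), since 2 is the only value present in all three tables. This equivalence holds for inner joins; outer joins are not covered by this sketch.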


