[
https://issues.apache.org/jira/browse/SPARK-43780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan reassigned SPARK-43780:
-----------------------------------
Assignee: Andrey Gubichev
> Support correlated columns in join ON conditions
> ------------------------------------------------
>
> Key: SPARK-43780
> URL: https://issues.apache.org/jira/browse/SPARK-43780
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Andrey Gubichev
> Assignee: Andrey Gubichev
> Priority: Major
>
> Subqueries that have joins with outer references in join conditions should be
> supported.
> For example:
>
> {code:java}
> spark-sql (default)> create view t0(t0a) as values (0), (1), (2);
> spark-sql (default)> create view t1(t1a) as values (1), (2), (3);
> spark-sql (default)> create view t2(t2a) as values (2), (3), (4);
> spark-sql (default)> select * from t0 join lateral (select * from t1 join t2
> on t1a = t2a and t1a = t0a);
> [UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.CORRELATED_REFERENCE] Unsupported
> subquery expression: Expressions referencing the outer query are not
> supported outside of WHERE/HAVING clauses: "((t1a = t2a) AND (t1a = t0a))".;
> line 1 pos 48;
> Project [t0a#15, t1a#17, t2a#19]
> +- LateralJoin lateral-subquery#14 [t0a#15], Inner
>    :  +- SubqueryAlias __auto_generated_subquery_name
>    :     +- Project [t1a#17, t2a#19]
>    :        +- Join Inner, ((t1a#17 = t2a#19) AND (t1a#17 = outer(t0a#15)))
>    :           :- SubqueryAlias spark_catalog.default.t1
>    :           :  +- View (`spark_catalog`.`default`.`t1`, [t1a#17])
>    :           :     +- Project [cast(col1#18 as int) AS t1a#17]
>    :           :        +- LocalRelation [col1#18]
>    :           +- SubqueryAlias spark_catalog.default.t2
>    :              +- View (`spark_catalog`.`default`.`t2`, [t2a#19])
>    :                 +- Project [cast(col1#20 as int) AS t2a#19]
>    :                    +- LocalRelation [col1#20]
>    +- SubqueryAlias spark_catalog.default.t0
>       +- View (`spark_catalog`.`default`.`t0`, [t0a#15])
>          +- Project [cast(col1#16 as int) AS t0a#15]
>             +- LocalRelation [col1#16]
> {code}
> Here, the subquery contains a join whose ON condition references the outer
> table t0. This should be handled in much the same way as correlation in
> Filter predicates.
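>
> A possible interim workaround, offered as a sketch rather than as part of the
> original report: for an inner join, a predicate in ON is equivalent to the
> same predicate in WHERE, so the correlated conjunct can be moved into the
> subquery's WHERE clause, which the error message indicates is supported.
> This assumes the same views t0, t1, and t2 and has not been verified against
> 3.4.0:
>
> {code:sql}
> -- Same query, with the outer reference t0a moved from ON to WHERE.
> -- Equivalent only because the join is INNER; for outer joins the two
> -- placements filter differently and this rewrite would change the result.
> select * from t0 join lateral (
>   select * from t1 join t2 on t1a = t2a
>   where t1a = t0a
> );
> {code}
> Under standard inner-join semantics, both forms should yield the single row
> (2, 2, 2) for the sample data.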