allisonwang-db commented on pull request #32303:
URL: https://github.com/apache/spark/pull/32303#issuecomment-854249253
@maropu @cloud-fan @viirya
I updated the logical plan structure to handle a few cases I found during
testing that do not work with the original plan representation. Originally, we
used the logical `Join` node directly with a new join type `LateralJoin`. This
structure breaks when relations are deduplicated, for instance:
```scala
'Join
:- LocalRelation [a, b]
+- Join LateralJoin(Inner)
   :- LocalRelation [a, b]  <--- after DeduplicateRelation: LocalRelation ['a, 'b]
   +- Project [outer(a) + outer(b)]  <--- but outer references won't be updated
      +- OneRowRelation
```
The `DeduplicateRelation` rule returns a new instance of
`LocalRelation [a, b]`, but the current `transformUpWithNewOutput` cannot
update the existing outer references in the right subtree. In addition, the
resolution logic overlaps heavily with `ResolveSubquery`. The new logic
unifies the analysis of lateral subqueries and existing subqueries: it
adds a new unary node `LateralJoin` and a new subquery expression
`LateralSubquery` to represent a lateral subquery. I've updated the PR
description.
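
To make the difference between the two shapes concrete, here is a minimal sketch. It uses a toy plan model, not Catalyst's classes: every name in it (`Attr`, `OuterRef`, `BinaryLateralJoin`, `dedupLeft`, and so on) is hypothetical and only mirrors the idea. It assumes that a rewrite of the left child can reach the parent node's own expressions but not a sibling subtree, which is why holding the subquery as an expression lets the outer references follow the new attribute ids.

```scala
object LateralDedupSketch {
  // Toy model only: these are NOT Spark's Catalyst classes.
  final case class Attr(name: String, id: Int)

  sealed trait Expr
  final case class OuterRef(attr: Attr) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr
  // A subquery held as an expression of its parent node.
  final case class LateralSubquery(plan: Plan) extends Expr

  sealed trait Plan
  final case class Relation(output: Seq[Attr]) extends Plan
  case object OneRow extends Plan
  final case class Project(exprs: Seq[Expr], child: Plan) extends Plan
  // Original shape: the lateral side is a sibling plan of the left relation.
  final case class BinaryLateralJoin(left: Plan, right: Plan) extends Plan
  // New shape: a unary node whose lateral side lives in an expression.
  final case class LateralJoin(child: Plan, subquery: LateralSubquery) extends Plan

  // Rewrites outer references using an old -> new attribute mapping,
  // descending into subquery plans held as expressions.
  def rewriteExpr(e: Expr, m: Map[Attr, Attr]): Expr = e match {
    case OuterRef(a)        => OuterRef(m.getOrElse(a, a))
    case Add(l, r)          => Add(rewriteExpr(l, m), rewriteExpr(r, m))
    case LateralSubquery(p) => LateralSubquery(rewriteOuterRefs(p, m))
  }

  // For this sketch only Project carries expressions; other nodes are left as-is.
  def rewriteOuterRefs(p: Plan, m: Map[Attr, Attr]): Plan = p match {
    case Project(es, c) => Project(es.map(rewriteExpr(_, m)), rewriteOuterRefs(c, m))
    case other          => other
  }

  // Stand-in for deduplication: returns a copy with fresh ids plus the mapping.
  def freshCopy(r: Relation, offset: Int): (Relation, Map[Attr, Attr]) = {
    val renamed = r.output.map(a => a.copy(id = a.id + offset))
    (Relation(renamed), r.output.zip(renamed).toMap)
  }

  // Replaces the left relation with a fresh copy. The mapping is applied only
  // to the parent node's own expressions, never to a sibling subtree.
  def dedupLeft(p: Plan): Plan = p match {
    case BinaryLateralJoin(r: Relation, right) =>
      val (fresh, _) = freshCopy(r, 100)
      BinaryLateralJoin(fresh, right) // no expression to rewrite: mapping is dropped
    case LateralJoin(r: Relation, sub) =>
      val (fresh, m) = freshCopy(r, 100)
      LateralJoin(fresh, LateralSubquery(rewriteOuterRefs(sub.plan, m)))
    case other => other
  }

  // Collects the attributes referenced through OuterRef anywhere in the plan.
  def outerRefs(p: Plan): Seq[Attr] = p match {
    case Project(es, c)          => es.flatMap(outerRefsIn) ++ outerRefs(c)
    case BinaryLateralJoin(l, r) => outerRefs(l) ++ outerRefs(r)
    case LateralJoin(c, sub)     => outerRefs(c) ++ outerRefs(sub.plan)
    case _                       => Nil
  }

  def outerRefsIn(e: Expr): Seq[Attr] = e match {
    case OuterRef(a)        => Seq(a)
    case Add(l, r)          => outerRefsIn(l) ++ outerRefsIn(r)
    case LateralSubquery(p) => outerRefs(p)
  }

  def main(args: Array[String]): Unit = {
    val a = Attr("a", 1)
    val b = Attr("b", 2)
    val lateralSide = Project(Seq(Add(OuterRef(a), OuterRef(b))), OneRow)

    // Binary shape: outer references keep the old ids after the left side changes.
    println(outerRefs(dedupLeft(BinaryLateralJoin(Relation(Seq(a, b)), lateralSide))))
    // Unary shape: outer references follow the fresh ids.
    println(outerRefs(dedupLeft(LateralJoin(Relation(Seq(a, b)), LateralSubquery(lateralSide)))))
  }
}
```

In the binary shape the outer references still point at ids 1 and 2 although the left relation now outputs 101 and 102; in the unary shape they are rewritten together with the child.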
I also removed the logic that resolves star expressions in subqueries. It is
not specific to lateral subqueries, so I've created a separate JIRA ticket: SPARK-35618.