allisonwang-db commented on pull request #32303:
URL: https://github.com/apache/spark/pull/32303#issuecomment-854249253


   @maropu @cloud-fan @viirya 
   I updated the logical plan structure to handle a few cases I found during 
testing that do not work with the original plan representation. Originally, we 
directly used the logical `Join` node with a new join type `LateralJoin`. This 
structure can't handle relation deduplication, for instance:
   ```scala
   'Join
   :- LocalRelation [a, b]
   +- Join LateralJoin(Inner)
      :- LocalRelation [a, b]  <--- after DeduplicateRelations: LocalRelation ['a, 'b]
      +- Project [outer(a) + outer(b)]  <--- but outer references won't be updated
         +- OneRowRelation
   ```
   The `DeduplicateRelations` rule returns a new instance of 
`LocalRelation [a, b]`, but the current `transformUpWithNewOutput` cannot 
update the existing outer references in the right subtree. In addition, the 
resolution logic overlaps heavily with `ResolveSubquery`. The new logic 
unifies the analysis of lateral subqueries and existing subqueries. It 
adds a new unary node `LateralJoin` and a new subquery expression 
`LateralSubquery` to represent a lateral subquery. I've updated the PR 
description accordingly.
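
   To make the stale outer reference problem concrete, here is a minimal toy 
model. It is a sketch only and does not use Catalyst classes: `Attr`, 
`Relation`, `Project`, `OneRow`, this simplified `LateralJoin`, and every helper 
function are hypothetical names made up for illustration. It assumes that an 
attribute is identified by its id rather than by its name, and that 
deduplication hands out fresh ids.

   ```scala
   // Toy model only: none of these types are Spark's Catalyst classes.
   object OuterRefSketch {
     // An attribute is identified by its id, not by its name.
     final case class Attr(name: String, id: Int)

     sealed trait Plan
     final case class Relation(output: Seq[Attr]) extends Plan
     // Right side of a lateral join: it refers to attributes produced by the
     // left side (the outer references).
     final case class Project(outerRefs: Seq[Attr], child: Plan) extends Plan
     case object OneRow extends Plan
     final case class LateralJoin(left: Plan, right: Plan) extends Plan

     // Deduplication: return a copy of the relation with fresh ids, plus the
     // mapping from each old attribute to its replacement.
     def dedup(rel: Relation, nextId: Int): (Relation, Map[Attr, Attr]) = {
       val fresh = rel.output.zipWithIndex.map { case (a, i) => a.copy(id = nextId + i) }
       (Relation(fresh), rel.output.zip(fresh).toMap)
     }

     // Naive rewrite: replaces only the left child, so outer references in the
     // right subtree still point at the old ids.
     def rewriteLeftOnly(j: LateralJoin, nextId: Int): LateralJoin = j.left match {
       case r: Relation => j.copy(left = dedup(r, nextId)._1)
       case _ => j
     }

     // Rewrite that also remaps the outer references in the right subtree.
     def rewriteWithOuterRefs(j: LateralJoin, nextId: Int): LateralJoin = j.left match {
       case r: Relation =>
         val (newLeft, mapping) = dedup(r, nextId)
         val newRight = j.right match {
           case p: Project => p.copy(outerRefs = p.outerRefs.map(a => mapping.getOrElse(a, a)))
           case other => other
         }
         LateralJoin(newLeft, newRight)
       case _ => j
     }

     // True when every outer reference is an attribute the left side produces.
     def outerRefsResolved(j: LateralJoin): Boolean = (j.left, j.right) match {
       case (Relation(out), Project(refs, _)) => refs.forall(r => out.contains(r))
       case _ => true
     }
   }
   ```

   In this sketch, `rewriteLeftOnly` leaves the join in a state where 
`outerRefsResolved` is false, while `rewriteWithOuterRefs` keeps it true. The 
point is only that the old-to-new attribute mapping has to reach the outer 
references in the right subtree.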
   
   I also removed the logic to resolve star expressions in subqueries. It is 
not specific to lateral subqueries, so I've created a separate JIRA ticket for 
it: SPARK-35618. 




