cloud-fan commented on PR #37074:
URL: https://github.com/apache/spark/pull/37074#issuecomment-1181935798

   After more thought, I think we should treat a correlated subquery as a join 
in optimizer rules. So in this case, once we remove the `Project`, the plan 
becomes invalid, because the subquery's outer reference, which will be in the 
join condition, becomes ambiguous.
   
   I think your initial approach is the right direction, but let's make it more 
precise. We should only keep the `Project` if:
   1. the filter condition contains correlated subqueries
   2. the subquery's outer references exist in both join sides if we remove 
the project.
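
   To make the two conditions concrete, here is a minimal sketch on a toy model rather than Catalyst's real classes. Everything in it is hypothetical: `CorrelatedSubquery`, `KeepProjectSketch`, and `shouldKeepProject` are illustration-only names, and attributes are plain strings such as `"a#1"` standing in for name plus expression ID. It also assumes that "both join sides" means the output of the filter's child once the `Project` is gone and the output of the subquery plan.

```scala
// Toy stand-in for a correlated subquery found in a filter condition.
// Hypothetical model; this is not Spark's SubqueryExpression.
final case class CorrelatedSubquery(
    outerReferences: Set[String], // attributes the subquery reads from the outer plan
    planOutput: Set[String])      // attributes produced by the subquery plan itself

object KeepProjectSketch {
  // Returns true when the Project should stay in place.
  //   subqueriesInFilter: correlated subqueries in the filter condition (condition 1)
  //   childOutput: output of the filter's child if the Project were removed
  def shouldKeepProject(
      subqueriesInFilter: Seq[CorrelatedSubquery],
      childOutput: Set[String]): Boolean = {
    // An empty sequence means condition 1 fails, so `exists` yields false.
    subqueriesInFilter.exists { sub =>
      // Condition 2: some outer reference is visible on both sides of the
      // would-be join, so it could no longer be resolved unambiguously.
      sub.outerReferences.exists { ref =>
        childOutput.contains(ref) && sub.planOutput.contains(ref)
      }
    }
  }
}
```

   In this model, a subquery whose plan output shares an attribute with the outer child keeps the `Project`, while a filter with no correlated subquery, or a subquery whose output attributes are all distinct from its outer references, lets the `Project` be removed.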


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

