mgaido91 commented on a change in pull request #23057: [SPARK-26078][SQL] Dedup
self-join attributes on IN subqueries
URL: https://github.com/apache/spark/pull/23057#discussion_r240619938
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala
##########
@@ -92,18 +114,20 @@ object RewritePredicateSubquery extends Rule[LogicalPlan]
with PredicateHelper {
// Deduplicate conflicting attributes if any.
dedupJoin(Join(outerPlan, sub, LeftAnti, joinCond))
Review comment:
The main problem is that in the other cases, i.e. when `Exists` is involved,
the condition has already been created. So we would need to complicate the
method quite a lot in order to handle both cases, and I am not sure whether it
is worth it. For instance, in the `Exists` case the `values` would have to be
extracted from the conditions, as the expressions referencing attributes from
one side, and the join condition would need to be rewritten. So I don't think
it is a good idea to have a common rewrite for both of them: it would be
overcomplicated IMHO.
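
To illustrate the asymmetry, here is a minimal toy sketch. It does not use
Catalyst's classes: `Attr`, `EqualTo`, `And`, `inCondition`, `rewrite` and
`outerRefs` are all hypothetical names made up for this example, and the
`rename` map stands in for whatever fresh attributes the dedup step would
produce. In the IN case the condition is built after renaming; in the `Exists`
case the existing condition has to be inspected and rewritten.

```scala
object DedupSketch {
  // Toy expression tree (hypothetical, not org.apache.spark.sql.catalyst).
  sealed trait Expr
  final case class Attr(name: String, id: Int) extends Expr
  final case class EqualTo(left: Expr, right: Expr) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr

  // IN case: `values` and the subquery output are still separate, so the
  // condition can be built directly from the already-renamed attributes.
  // Assumes `values` is non-empty (reduce fails on an empty sequence).
  def inCondition(values: Seq[Attr], subOutput: Seq[Attr],
                  rename: Map[Attr, Attr]): Expr =
    values.zip(subOutput)
      .map { case (v, s) => EqualTo(v, rename.getOrElse(s, s)): Expr }
      .reduce(And(_, _))

  // Exists case, step 1: the condition already exists, so every attribute
  // inside it has to be rewritten in place.
  def rewrite(cond: Expr, rename: Map[Attr, Attr]): Expr = cond match {
    case a: Attr       => rename.getOrElse(a, a)
    case EqualTo(l, r) => EqualTo(rewrite(l, rename), rewrite(r, rename))
    case And(l, r)     => And(rewrite(l, rename), rewrite(r, rename))
  }

  // Exists case, step 2: the outer-side "values" are not given, they have
  // to be recovered by walking the condition.
  def outerRefs(cond: Expr, outer: Set[Attr]): Seq[Attr] = cond match {
    case a: Attr       => if (outer.contains(a)) Seq(a) else Seq.empty
    case EqualTo(l, r) => outerRefs(l, outer) ++ outerRefs(r, outer)
    case And(l, r)     => outerRefs(l, outer) ++ outerRefs(r, outer)
  }
}
```

A common method would need both code paths, which is the extra complexity I
would like to avoid.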