mgaido91 commented on a change in pull request #23057: [SPARK-26078][SQL] Dedup 
self-join attributes on IN subqueries
URL: https://github.com/apache/spark/pull/23057#discussion_r240619938
 
 

 ##########
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala
 ##########
 @@ -92,18 +114,20 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] 
with PredicateHelper {
           // Deduplicate conflicting attributes if any.
           dedupJoin(Join(outerPlan, sub, LeftAnti, joinCond))
 
 Review comment:
   The main problem is that in the other cases, i.e. when an `Exists` is 
present, the condition has already been created. So we would need to complicate 
the method quite a lot in order to handle the 2 cases, and I am not sure whether 
it is worth it. For instance, in the `Exists` case the `values` would have to be 
extracted from the conditions, as those expressions that reference attributes 
from only one side, and the join condition would then need to be rewritten. So I 
don't think it is a good idea to have a common rewrite for both of them: it 
would be overcomplicated IMHO.
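
To illustrate the asymmetry, here is a toy sketch. It deliberately does not use 
the catalyst classes: `Expr`, `Attr`, `EqualTo`, `inCondition` and `outerValues` 
are hypothetical names made up for this example, and it only models plain 
equality conditions.

```scala
// Toy model only: these are NOT the catalyst classes.
sealed trait Expr { def references: Set[String] }

case class Attr(name: String) extends Expr {
  def references: Set[String] = Set(name)
}

case class EqualTo(left: Expr, right: Expr) extends Expr {
  def references: Set[String] = left.references ++ right.references
}

object SubqueryRewriteSketch {
  // IN case: the outer `values` are given explicitly, so the join
  // condition is built here by pairing them with the subquery output.
  def inCondition(values: Seq[Expr], subOutput: Seq[Attr]): Seq[Expr] =
    values.zip(subOutput).map { case (v, o) => EqualTo(v, o) }

  // Exists case: the conditions already exist, so the outer-side
  // expressions have to be recovered by checking which operand of each
  // equality references only outer attributes.
  def outerValues(conditions: Seq[Expr], outerAttrs: Set[String]): Seq[Expr] =
    conditions
      .collect { case EqualTo(l, r) => Seq(l, r) }
      .flatten
      .filter(e => e.references.nonEmpty && e.references.subsetOf(outerAttrs))

  def main(args: Array[String]): Unit = {
    // Build a condition the "IN" way, then recover the values the "Exists" way.
    val cond = inCondition(Seq(Attr("a")), Seq(Attr("x")))
    println(cond)
    println(outerValues(cond, Set("a")))
  }
}
```

In the first function the values are an input; in the second they have to be 
reverse-engineered from conditions that were built elsewhere, and a shared 
rewrite would additionally have to rebuild those conditions afterwards.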
