[ 
https://issues.apache.org/jira/browse/SPARK-19712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16716212#comment-16716212
 ] 

ASF GitHub Bot commented on SPARK-19712:
----------------------------------------

dilipbiswal commented on issue #23211: [SPARK-19712][SQL] Move 
PullupCorrelatedPredicates and RewritePredicateSubquery after OptimizeSubqueries
URL: https://github.com/apache/spark/pull/23211#issuecomment-446076575
 
 
   @cloud-fan Thanks Wenchen. It makes sense. I will work in creating a smaller 
pr first.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> EXISTS and Left Semi join do not produce the same plan
> ------------------------------------------------------
>
>                 Key: SPARK-19712
>                 URL: https://issues.apache.org/jira/browse/SPARK-19712
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Nattavut Sutyanyong
>            Priority: Major
>
> This problem was found during the development of SPARK-18874.
> The EXISTS form in the following query:
> {{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a where exists (select 1 
> from t3 where t1.t1b=t3.t3b)")}}
> gives the optimized plan below:
> {code}
> == Optimized Logical Plan ==
> Join Inner, (t1a#7 = t2a#25)
> :- Join LeftSemi, (t1b#8 = t3b#58)
> :  :- Filter isnotnull(t1a#7)
> :  :  +- Relation[t1a#7,t1b#8,t1c#9] parquet
> :  +- Project [1 AS 1#271, t3b#58]
> :     +- Relation[t3a#57,t3b#58,t3c#59] parquet
> +- Filter isnotnull(t2a#25)
>    +- Relation[t2a#25,t2b#26,t2c#27] parquet
> {code}
> whereas a semantically equivalent Left Semi join query below:
> {{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a left semi join t3 on 
> t1.t1b=t3.t3b")}}
> gives the following optimized plan:
> {code}
> == Optimized Logical Plan ==
> Join LeftSemi, (t1b#8 = t3b#58)
> :- Join Inner, (t1a#7 = t2a#25)
> :  :- Filter (isnotnull(t1b#8) && isnotnull(t1a#7))
> :  :  +- Relation[t1a#7,t1b#8,t1c#9] parquet
> :  +- Filter isnotnull(t2a#25)
> :     +- Relation[t2a#25,t2b#26,t2c#27] parquet
> +- Project [t3b#58]
>    +- Relation[t3a#57,t3b#58,t3c#59] parquet
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to