[
https://issues.apache.org/jira/browse/SPARK-19712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715469#comment-16715469
]
ASF GitHub Bot commented on SPARK-19712:
----------------------------------------
dilipbiswal edited a comment on issue #23211: [SPARK-19712][SQL] Move
PullupCorrelatedPredicates and RewritePredicateSubquery after OptimizeSubqueries
URL: https://github.com/apache/spark/pull/23211#issuecomment-445945148
@cloud-fan Just to make sure, so we want this new rule and associated tests
to verify the pushdown of left semi/anti joins. We would keep the subquery
rewrite at the same place first i.e not move it up in the new PR, correct ?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> EXISTS and Left Semi join do not produce the same plan
> ------------------------------------------------------
>
> Key: SPARK-19712
> URL: https://issues.apache.org/jira/browse/SPARK-19712
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: Nattavut Sutyanyong
> Priority: Major
>
> This problem was found during the development of SPARK-18874.
> The EXISTS form in the following query:
> {{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a where exists (select 1
> from t3 where t1.t1b=t3.t3b)")}}
> gives the optimized plan below:
> {code}
> == Optimized Logical Plan ==
> Join Inner, (t1a#7 = t2a#25)
> :- Join LeftSemi, (t1b#8 = t3b#58)
> : :- Filter isnotnull(t1a#7)
> : : +- Relation[t1a#7,t1b#8,t1c#9] parquet
> : +- Project [1 AS 1#271, t3b#58]
> : +- Relation[t3a#57,t3b#58,t3c#59] parquet
> +- Filter isnotnull(t2a#25)
> +- Relation[t2a#25,t2b#26,t2c#27] parquet
> {code}
> whereas a semantically equivalent Left Semi join query below:
> {{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a left semi join t3 on
> t1.t1b=t3.t3b")}}
> gives the following optimized plan:
> {code}
> == Optimized Logical Plan ==
> Join LeftSemi, (t1b#8 = t3b#58)
> :- Join Inner, (t1a#7 = t2a#25)
> : :- Filter (isnotnull(t1b#8) && isnotnull(t1a#7))
> : : +- Relation[t1a#7,t1b#8,t1c#9] parquet
> : +- Filter isnotnull(t2a#25)
> : +- Relation[t2a#25,t2b#26,t2c#27] parquet
> +- Project [t3b#58]
> +- Relation[t3a#57,t3b#58,t3c#59] parquet
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]