[
https://issues.apache.org/jira/browse/SPARK-19712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nattavut Sutyanyong updated SPARK-19712:
----------------------------------------
Description:
This problem was found during the development of SPARK-18874.
The EXISTS form in the following query:
{{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a where exists (select 1
from t3 where t1.t1b=t3.t3b)")}}
gives the optimized plan below:
{code}
== Optimized Logical Plan ==
Join Inner, (t1a#7 = t2a#25)
:- Join LeftSemi, (t1b#8 = t3b#58)
: :- Filter isnotnull(t1a#7)
: : +- Relation[t1a#7,t1b#8,t1c#9] parquet
: +- Project [1 AS 1#271, t3b#58]
: +- Relation[t3a#57,t3b#58,t3c#59] parquet
+- Filter isnotnull(t2a#25)
+- Relation[t2a#25,t2b#26,t2c#27] parquet
{code}
whereas a semantically equivalent Left Semi join query below:
{{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a left semi join t3 on
t1.t1b=t3.t3b")}}
gives the following optimized plan:
{code}
== Optimized Logical Plan ==
Join LeftSemi, (t1b#8 = t3b#58)
:- Join Inner, (t1a#7 = t2a#25)
: :- Filter (isnotnull(t1b#8) && isnotnull(t1a#7))
: : +- Relation[t1a#7,t1b#8,t1c#9] parquet
: +- Filter isnotnull(t2a#25)
: +- Relation[t2a#25,t2b#26,t2c#27] parquet
+- Project [t3b#58]
+- Relation[t3a#57,t3b#58,t3c#59] parquet
{code}
was:
This problem is found during the development of SPARK-18874.
The EXISTS form in the following query:
{{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a where exists (select 1
from t3 where t1.t1b=t3.t3b)")}}
gives the optimized plan below:
{code}
== Optimized Logical Plan ==
Join Inner, (t1a#7 = t2a#25)
:- Join LeftSemi, (t1b#8 = t3b#58)
: :- Filter isnotnull(t1a#7)
: : +- Relation[t1a#7,t1b#8,t1c#9] parquet
: +- Project [1 AS 1#271, t3b#58]
: +- Relation[t3a#57,t3b#58,t3c#59] parquet
+- Filter isnotnull(t2a#25)
+- Relation[t2a#25,t2b#26,t2c#27] parquet
{code}
whereas a semantically equivalent Left Semi join query below:
{{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a left semi join t3 on
t1.t1b=t3.t3b")}}
gives the following optimized plan:
{code}
== Optimized Logical Plan ==
Join LeftSemi, (t1b#8 = t3b#58)
:- Join Inner, (t1a#7 = t2a#25)
: :- Filter (isnotnull(t1b#8) && isnotnull(t1a#7))
: : +- Relation[t1a#7,t1b#8,t1c#9] parquet
: +- Filter isnotnull(t2a#25)
: +- Relation[t2a#25,t2b#26,t2c#27] parquet
+- Project [t3b#58]
+- Relation[t3a#57,t3b#58,t3c#59] parquet
{code}
> EXISTS and Left Semi join do not produce the same plan
> ------------------------------------------------------
>
> Key: SPARK-19712
> URL: https://issues.apache.org/jira/browse/SPARK-19712
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: Nattavut Sutyanyong
>
> This problem was found during the development of SPARK-18874.
> The EXISTS form in the following query:
> {{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a where exists (select 1
> from t3 where t1.t1b=t3.t3b)")}}
> gives the optimized plan below:
> {code}
> == Optimized Logical Plan ==
> Join Inner, (t1a#7 = t2a#25)
> :- Join LeftSemi, (t1b#8 = t3b#58)
> : :- Filter isnotnull(t1a#7)
> : : +- Relation[t1a#7,t1b#8,t1c#9] parquet
> : +- Project [1 AS 1#271, t3b#58]
> : +- Relation[t3a#57,t3b#58,t3c#59] parquet
> +- Filter isnotnull(t2a#25)
> +- Relation[t2a#25,t2b#26,t2c#27] parquet
> {code}
> whereas a semantically equivalent Left Semi join query below:
> {{sql("select * from t1 inner join t2 on t1.t1a=t2.t2a left semi join t3 on
> t1.t1b=t3.t3b")}}
> gives the following optimized plan:
> {code}
> == Optimized Logical Plan ==
> Join LeftSemi, (t1b#8 = t3b#58)
> :- Join Inner, (t1a#7 = t2a#25)
> : :- Filter (isnotnull(t1b#8) && isnotnull(t1a#7))
> : : +- Relation[t1a#7,t1b#8,t1c#9] parquet
> : +- Filter isnotnull(t2a#25)
> : +- Relation[t2a#25,t2b#26,t2c#27] parquet
> +- Project [t3b#58]
> +- Relation[t3a#57,t3b#58,t3c#59] parquet
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]