[ https://issues.apache.org/jira/browse/SPARK-23540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
KaiXinXIaoLei resolved SPARK-23540.
-----------------------------------
    Resolution: Duplicate

> The `where exists' action in optimized logical plan should be optimized
> ------------------------------------------------------------------------
>
>                 Key: SPARK-23540
>                 URL: https://issues.apache.org/jira/browse/SPARK-23540
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: KaiXinXIaoLei
>            Priority: Major
>
> The optimized logical plan of the query `select * from tt1 where exists
> (select * from tt2 where tt1.i = tt2.i);` is:
> == Optimized Logical Plan ==
> Join LeftSemi, (i#143 = i#145)
> :- MetastoreRelation default, tt1
> +- MetastoreRelation default, tt2
>
> But the optimized logical plan of the query `select * from tt1 left semi
> join tt2 on tt2.i = tt1.i` is:
> == Optimized Logical Plan ==
> Join LeftSemi, (i#152 = i#150)
> :- Filter isnotnull(i#150)
> :  +- MetastoreRelation default, tt1
> +- Project [i#152]
>    +- MetastoreRelation default, tt2
>
> So I think the optimized logical plan of `select * from tt1 where exists
> (select * from tt2 where tt1.i = tt2.i);` should be optimized further.
>
> == Optimized Logical Plan ==
> Join LeftSemi, (i#143 = i#145)
> :- MetastoreRelation default, tt1
> +- MetastoreRelation default, tt2
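
The `Filter isnotnull(i#150)` step in the second plan can be illustrated with a small toy model. The sketch below is not Spark code and does not reflect how the optimizer is implemented; it only assumes SQL three-valued equality (a comparison with NULL is never true) and uses hypothetical helper names (`sql_eq`, `left_semi_join`, `left_semi_join_prefiltered`) with Python's `None` standing in for NULL. Under those assumptions, dropping NULL keys from the left side before a left semi join on equality returns the same rows as the unfiltered join, which is why the extra filter would be safe to add.

```python
def sql_eq(a, b):
    """SQL-style equality: the result is unknown (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def left_semi_join(left, right):
    """Keep each left key that compares equal (strictly True) to some right key."""
    return [l for l in left if any(sql_eq(l, r) is True for r in right)]


def left_semi_join_prefiltered(left, right):
    """Same join, but drop NULL keys from the left side first (toy isnotnull filter)."""
    non_null_left = [l for l in left if l is not None]
    return left_semi_join(non_null_left, right)


# Toy single-column tables; the values are made up for illustration.
tt1 = [1, 2, None, 3, 3]
tt2 = [3, None, 1, 4]

plain = left_semi_join(tt1, tt2)
prefiltered = left_semi_join_prefiltered(tt1, tt2)

print(plain)        # [1, 3, 3]
print(prefiltered)  # [1, 3, 3]
```

Both variants return the same rows here because a NULL key on the left can never satisfy the equality condition, so removing it early only reduces the work done by the join.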