[ https://issues.apache.org/jira/browse/SPARK-22662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li resolved SPARK-22662.
-----------------------------
       Resolution: Fixed
         Assignee: Zhenhua Wang
    Fix Version/s: 2.3.0

> Failed to prune columns after rewriting predicate subquery
> ----------------------------------------------------------
>
>                 Key: SPARK-22662
>                 URL: https://issues.apache.org/jira/browse/SPARK-22662
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Zhenhua Wang
>            Assignee: Zhenhua Wang
>             Fix For: 2.3.0
>
>
> As a simple example:
> {code}
> spark-sql> create table base (a int, b int) using parquet;
> Time taken: 0.066 seconds
> spark-sql> create table relInSubq ( x int, y int, z int) using parquet;
> Time taken: 0.042 seconds
> spark-sql> explain select a from base where a in (select x from relInSubq);
> == Physical Plan ==
> *Project [a#83]
> +- *BroadcastHashJoin [a#83], [x#85], LeftSemi, BuildRight
>    :- *FileScan parquet default.base[a#83,b#84] Batched: true, Format: Parquet, Location: InMemoryFileIndex[hdfs://100.0.0.4:9000/wzh/base], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:int,b:int>
>    +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)))
>       +- *Project [x#85]
>          +- *FileScan parquet default.relinsubq[x#85] Batched: true, Format: Parquet, Location: InMemoryFileIndex[hdfs://100.0.0.4:9000/wzh/relinsubq], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<x:int>
> {code}
> We only need column `a` in table `base`, but all columns (`a`, `b`) are
> fetched.
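
To make the expected behaviour concrete, here is a minimal sketch of column pruning through a left semi join. It is a toy model written for illustration only: it is not Spark's Catalyst code and not the fix that was merged, and every name in it (Scan, Project, LeftSemiJoin, prune, read_columns) is hypothetical. It assumes a left semi join outputs only left-side columns, so the left child needs the columns required above the join plus its own join key, while the right child needs only its join key.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Union


@dataclass
class Scan:
    table: str
    columns: List[str]  # columns the scan currently reads


@dataclass
class Project:
    columns: List[str]
    child: "Plan"


@dataclass
class LeftSemiJoin:
    left: "Plan"
    right: "Plan"
    left_key: str
    right_key: str


Plan = Union[Scan, Project, LeftSemiJoin]


def prune(plan: Plan, required: Set[str]) -> Plan:
    """Return a copy of the plan whose scans read only the required columns."""
    if isinstance(plan, Scan):
        return Scan(plan.table, [c for c in plan.columns if c in required])
    if isinstance(plan, Project):
        kept = [c for c in plan.columns if c in required]
        return Project(kept, prune(plan.child, set(kept)))
    if isinstance(plan, LeftSemiJoin):
        # Left side: whatever the parent needs, plus the left join key.
        left = prune(plan.left, required | {plan.left_key})
        # Right side: only the join key, since a semi join emits no right columns.
        right = prune(plan.right, {plan.right_key})
        return LeftSemiJoin(left, right, plan.left_key, plan.right_key)
    raise TypeError(f"unknown plan node: {plan!r}")


def read_columns(plan: Plan) -> Dict[str, List[str]]:
    """Map each scanned table to the columns its scan reads."""
    if isinstance(plan, Scan):
        return {plan.table: plan.columns}
    if isinstance(plan, Project):
        return read_columns(plan.child)
    return {**read_columns(plan.left), **read_columns(plan.right)}


# The plan shape from the report: the scan of `base` reads both a and b.
reported = Project(
    ["a"],
    LeftSemiJoin(
        left=Scan("base", ["a", "b"]),
        right=Project(["x"], Scan("relinsubq", ["x"])),
        left_key="a",
        right_key="x",
    ),
)

pruned = prune(reported, {"a"})
print(read_columns(reported))
print(read_columns(pruned))
```

In this model the scan of `base` reads `a` and `b` before pruning and only `a` afterwards, which corresponds to the ReadSchema the report expects for `default.base`.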