Liang-Chi Hsieh created SPARK-25363:
---------------------------------------
             Summary: Schema pruning doesn't work if nested column is used in where clause
                 Key: SPARK-25363
                 URL: https://issues.apache.org/jira/browse/SPARK-25363
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.0
            Reporter: Liang-Chi Hsieh


Schema pruning doesn't work if a nested column is used in the where clause. For example,

{code}
sql("select name.first from contacts where name.first = 'David'")

== Physical Plan ==
*(1) Project [name#19.first AS first#40]
+- *(1) Filter (isnotnull(name#19) && (name#19.first = David))
   +- *(1) FileScan parquet [name#19] Batched: false, Format: Parquet, PartitionFilters: [], PushedFilters: [IsNotNull(name)], ReadSchema: struct<name:struct<first:string,middle:string,last:string>>
{code}

In the query plan above, the scan node reads the entire schema of the `name` column, even though the query only references `name.first`.

This issue was reported in: https://github.com/apache/spark/pull/21320#issuecomment-419290197
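To make the expected behavior concrete, here is a minimal, standalone Python sketch of nested schema pruning. It is not Spark's implementation. The helper names (`prune_schema`, `render`) and the dict-based schema encoding are assumptions made for illustration, and the `contacts` schema contains only the `name` struct shown in the ReadSchema above. The sketch suggests one plausible reading of the plan: the filter's `isnotnull(name#19)` refers to `name` as a whole, and a whole-struct reference forces every child field to be read.

```python
from typing import Dict, Iterable, List, Union

# Hypothetical encoding: a struct is a dict of field name -> type,
# and a leaf type is a plain string such as "string".
Schema = Dict[str, Union[str, "Schema"]]


def prune_schema(schema: Schema, paths: Iterable[str]) -> Schema:
    """Keep only the fields reachable through the dotted paths in `paths`."""
    # Group the remainder of each path by its first segment.
    by_head: Dict[str, List[str]] = {}
    for path in paths:
        head, _, rest = path.partition(".")
        by_head.setdefault(head, []).append(rest)

    unknown = set(by_head) - set(schema)
    if unknown:
        raise KeyError(f"unknown fields: {sorted(unknown)}")

    pruned: Schema = {}
    for name, field in schema.items():  # preserves the original field order
        if name not in by_head:
            continue
        rests = by_head[name]
        if isinstance(field, dict) and all(rests):
            # Every reference goes through a child field: recurse and prune.
            pruned[name] = prune_schema(field, rests)
        else:
            # A leaf, or the struct is referenced as a whole: keep it all.
            pruned[name] = field
    return pruned


def render(schema: Union[str, Schema]) -> str:
    """Format a schema in the struct<...> style used by ReadSchema."""
    if isinstance(schema, str):
        return schema
    fields = ",".join(f"{name}:{render(field)}" for name, field in schema.items())
    return f"struct<{fields}>"


# Schema of the `name` column, taken from the ReadSchema in the plan.
contacts: Schema = {
    "name": {"first": "string", "middle": "string", "last": "string"},
}

# Only `name.first` is referenced by the select list and the comparison.
print(render(prune_schema(contacts, ["name.first", "name.first"])))
# struct<name:struct<first:string>>

# Adding a whole-struct reference (as isnotnull(name) does) defeats pruning.
print(render(prune_schema(contacts, ["name.first", "name"])))
# struct<name:struct<first:string,middle:string,last:string>>
```

Under these assumptions, the first call shows the ReadSchema one would hope for, and the second reproduces the unpruned ReadSchema from the plan.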