[ https://issues.apache.org/jira/browse/SPARK-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cheng Lian updated SPARK-1913:
------------------------------
    Description:
When scanning Parquet tables, attributes referenced only in predicates that are pushed down are not passed to the `ParquetTableScan` operator, which causes an exception. Verified in {{sbt hive/console}}:
{code:scala}
loadTestTable("src")
table("src").saveAsParquetFile("src.parquet")
parquetFile("src.parquet").registerAsTable("src_parquet")
hql("SELECT value FROM src_parquet WHERE key < 10").collect().foreach(println)
{code}

  was:
case class Person(name: String, age: Int)

If we use a Parquet file, the following statement throws an exception saying java.lang.IllegalArgumentException: Column age does not exist:

sql("SELECT name FROM parquetFile WHERE age >= 13 AND age <= 19")

We have to add the age column to the SELECT list to make it work:

sql("SELECT name, age FROM parquetFile WHERE age >= 13 AND age <= 19")

The same error also occurs when we use the DSL:

parquetFile.where('key === 1).select('value as 'a).collect().foreach(println)

This fails because column 'key' cannot be found. We have to work around it like this:

parquetFile.where('key === 1).select('key, 'value as 'a).collect().foreach(println)

Obviously, that's not the way we want!


> column pruning problem of Parquet File
> ---------------------------------------
>
>                 Key: SPARK-1913
>                 URL: https://issues.apache.org/jira/browse/SPARK-1913
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.0.0
>         Environment: mac os 10.9.2
>            Reporter: Chen Chao
>
> When scanning Parquet tables, attributes referenced only in predicates that
> are pushed down are not passed to the `ParquetTableScan` operator, which
> causes an exception. Verified in {{sbt hive/console}}:
> {code:scala}
> loadTestTable("src")
> table("src").saveAsParquetFile("src.parquet")
> parquetFile("src.parquet").registerAsTable("src_parquet")
> hql("SELECT value FROM src_parquet WHERE key < 10").collect().foreach(println)
> {code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
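
For illustration only, here is a minimal sketch of the behavior the description implies a scan should have: the columns read from the Parquet file should be the union of the projected columns and the columns referenced by pushed-down predicates. This is a simplified model and not Spark's planner code. `ColumnPruningSketch`, `Predicate`, and `requiredColumns` are hypothetical names introduced here.

```scala
// Hypothetical, simplified model of column pruning; these are NOT Spark classes.
object ColumnPruningSketch {
  // A pushed-down predicate such as `key < 10`, reduced to the column it reads.
  final case class Predicate(column: String, description: String)

  // Columns the scan must read: the projected columns first, followed by any
  // column that is referenced only by a pushed-down predicate.
  def requiredColumns(projected: Seq[String], pushedDown: Seq[Predicate]): Seq[String] =
    (projected ++ pushedDown.map(_.column)).distinct

  def main(args: Array[String]): Unit = {
    // Models: SELECT value FROM src_parquet WHERE key < 10
    val cols = requiredColumns(Seq("value"), Seq(Predicate("key", "key < 10")))
    println(cols.mkString(", ")) // prints "value, key"
  }
}
```

Under this model, the query from the description would request both `value` and `key` from the file, so the predicate on `key` can be evaluated even though `key` is not in the SELECT list.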