[ https://issues.apache.org/jira/browse/SPARK-25207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuming Wang updated SPARK-25207:
--------------------------------
    Issue Type: Sub-task  (was: Bug)
        Parent: SPARK-25419

> Case-insensitive field resolution for filter pushdown when reading Parquet
> --------------------------------------------------------------------------
>
>                 Key: SPARK-25207
>                 URL: https://issues.apache.org/jira/browse/SPARK-25207
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: yucai
>            Assignee: yucai
>            Priority: Major
>              Labels: Parquet
>             Fix For: 2.4.0
>
>         Attachments: image.png
>
>
> Currently, filter pushdown will not work if the Parquet schema and the Hive
> metastore schema use different letter cases, even when spark.sql.caseSensitive
> is false. For example:
> {code:java}
> spark.range(10).write.parquet("/tmp/data")
> sql("DROP TABLE t")
> sql("CREATE TABLE t (ID LONG) USING parquet LOCATION '/tmp/data'")
> sql("select * from t where id > 0").show
> {code}
> -No filter will be pushed down.-
> {code}
> scala> sql("select * from t where id > 0").explain // Filters are pushed with `ID`
> == Physical Plan ==
> *(1) Project [ID#90L]
> +- *(1) Filter (isnotnull(id#90L) && (id#90L > 0))
>    +- *(1) FileScan parquet default.t[ID#90L] Batched: true, Format: Parquet,
>       Location: InMemoryFileIndex[file:/tmp/data], PartitionFilters: [],
>       PushedFilters: [IsNotNull(ID), GreaterThan(ID,0)], ReadSchema: struct<ID:bigint>
>
> scala> sql("select * from t").show // Parquet returns NULL for `ID` because the file has `id`.
> +----+
> |  ID|
> +----+
> |null|
> |null|
> |null|
> |null|
> |null|
> |null|
> |null|
> |null|
> |null|
> |null|
> +----+
>
> scala> sql("select * from t where id > 0").show // `NULL > 0` is `false`.
> +---+
> | ID|
> +---+
> +---+
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
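The failure mode above suggests the shape of a fix: before pushing a filter down, the field name from the (metastore-cased) filter must be resolved against the actual Parquet field names, case-insensitively when spark.sql.caseSensitive is false, and the pushdown must be skipped when the match is ambiguous. The sketch below is only an illustration of that resolution step, not the actual Spark patch; the object and method names (CaseInsensitiveResolution, resolve) are hypothetical.

```scala
// Hypothetical sketch of case-insensitive field resolution for pushdown.
// Not Spark's implementation: names and structure are invented for illustration.
object CaseInsensitiveResolution {
  // Map a filter's field name (e.g. "ID" from the metastore schema) to the
  // physical Parquet field name (e.g. "id"). Returns None when no field
  // matches, or when the case-insensitive match is ambiguous (in which case
  // the filter should simply not be pushed down).
  def resolve(fieldName: String,
              parquetFields: Seq[String],
              caseSensitive: Boolean): Option[String] = {
    if (caseSensitive) {
      parquetFields.find(_ == fieldName)
    } else {
      parquetFields.filter(_.equalsIgnoreCase(fieldName)) match {
        case Seq(single) => Some(single) // exactly one match: safe to push down
        case _           => None         // zero or duplicate matches: skip pushdown
      }
    }
  }

  def main(args: Array[String]): Unit = {
    // With the table from the report: metastore says `ID`, Parquet file has `id`.
    println(resolve("ID", Seq("id"), caseSensitive = false)) // Some(id)
    // Ambiguous in case-insensitive mode: both `id` and `ID` exist in the file.
    println(resolve("ID", Seq("id", "ID"), caseSensitive = false)) // None
  }
}
```

With this resolution step, the pushed filter would read GreaterThan(id,0) instead of GreaterThan(ID,0), so Parquet can evaluate it against the real column rather than returning NULLs for a non-existent one.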