Github user ghoto commented on a diff in the pull request:
https://github.com/apache/spark/pull/21086#discussion_r188473831
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala ---
@@ -351,12 +338,26 @@ class ParquetFileFormat
val timestampConversion: Boolean =
sparkSession.sessionState.conf.isParquetINT96TimestampConversion
val capacity = sqlConf.parquetVectorizedReaderBatchSize
+ val enableParquetFilterPushDown: Boolean =
+ sparkSession.sessionState.conf.parquetFilterPushDown
// Whole stage codegen (PhysicalRDD) is able to deal with batches directly
val returningBatch = supportBatch(sparkSession, resultSchema)
(file: PartitionedFile) => {
assert(file.partitionValues.numFields == partitionSchema.size)
+ // Try to push down filters when filter push-down is enabled.
--- End diff ---
So this code is the same as before. How can this solve the bug described at the top of the Conversation?
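
For context, here is a minimal, self-contained Scala sketch (not Spark's actual implementation) of the gating pattern the diff introduces: Parquet predicates are built only when the session-level push-down flag is enabled, and filters that cannot be converted are simply not pushed. The names Filter, ParquetPredicate, and toParquetPredicate are illustrative stand-ins; Spark's real code path goes through ParquetFilters.createFilter and Parquet's FilterPredicate.

object PushDownSketch {
  sealed trait Filter
  final case class GreaterThan(col: String, value: Int) extends Filter

  // Stand-in for a Parquet-level predicate (FilterPredicate in the real code).
  final case class ParquetPredicate(description: String)

  // Hypothetical conversion; Spark actually uses ParquetFilters.createFilter.
  def toParquetPredicate(f: Filter): Option[ParquetPredicate] = f match {
    case GreaterThan(c, v) => Some(ParquetPredicate(s"$c > $v"))
  }

  // Mirrors the diff: predicates are built only when the flag is on.
  // Spark re-evaluates all filters after the scan, so declining to push a
  // filter never changes query results, only how much data is read.
  def pushedPredicates(
      enableParquetFilterPushDown: Boolean,
      filters: Seq[Filter]): Seq[ParquetPredicate] =
    if (enableParquetFilterPushDown) filters.flatMap(toParquetPredicate)
    else Seq.empty

  def main(args: Array[String]): Unit = {
    val filters = Seq(GreaterThan("age", 21))
    println(pushedPredicates(enableParquetFilterPushDown = true, filters))
    // List(ParquetPredicate(age > 21))
    println(pushedPredicates(enableParquetFilterPushDown = false, filters))
    // List()
  }
}

Under these assumptions, the behavioral question raised above stands: the gating logic itself is unchanged, so any fix would have to lie in where the predicates are built or applied, not in this flag check.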
---