Github user ghoto commented on a diff in the pull request:
https://github.com/apache/spark/pull/21086#discussion_r188473831
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala ---
@@ -351,12 +338,26 @@ class ParquetFileFormat
val timestampConversion: Boolean =
sparkSession.sessionState.conf.isParquetINT96TimestampConversion
val capacity = sqlConf.parquetVectorizedReaderBatchSize
+ val enableParquetFilterPushDown: Boolean =
+ sparkSession.sessionState.conf.parquetFilterPushDown
// Whole stage codegen (PhysicalRDD) is able to deal with batches directly
val returningBatch = supportBatch(sparkSession, resultSchema)
(file: PartitionedFile) => {
assert(file.partitionValues.numFields == partitionSchema.size)
+ // Try to push down filters when filter push-down is enabled.
--- End diff ---
So this code is the same as before. How can this solve the bug described at the top of the Conversation?
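
For context, here is a minimal, self-contained Scala sketch (not Spark's actual implementation) of the gating pattern the diff introduces: Parquet predicates are built only when the session-level push-down flag is enabled, and filters that cannot be converted are simply not pushed. The names Filter, ParquetPredicate, and toParquetPredicate are illustrative stand-ins; Spark's real code path goes through ParquetFilters.createFilter and Parquet's FilterPredicate.

object PushDownSketch {
  sealed trait Filter
  final case class GreaterThan(col: String, value: Int) extends Filter

  // Stand-in for a Parquet-level predicate (FilterPredicate in the real code).
  final case class ParquetPredicate(description: String)

  // Hypothetical conversion; Spark actually uses ParquetFilters.createFilter.
  def toParquetPredicate(f: Filter): Option[ParquetPredicate] = f match {
    case GreaterThan(c, v) => Some(ParquetPredicate(s"$c > $v"))
  }

  // Mirrors the diff: predicates are built only when the flag is on.
  // Spark re-evaluates all filters after the scan, so declining to push a
  // filter never changes query results, only how much data is read.
  def pushedPredicates(
      enableParquetFilterPushDown: Boolean,
      filters: Seq[Filter]): Seq[ParquetPredicate] =
    if (enableParquetFilterPushDown) filters.flatMap(toParquetPredicate)
    else Seq.empty

  def main(args: Array[String]): Unit = {
    val filters = Seq(GreaterThan("age", 21))
    println(pushedPredicates(enableParquetFilterPushDown = true, filters))
    // List(ParquetPredicate(age > 21))
    println(pushedPredicates(enableParquetFilterPushDown = false, filters))
    // List()
  }
}

Under these assumptions, the behavioral question raised above stands: the gating logic itself is unchanged, so any fix would have to lie in where the predicates are built or applied, not in this flag check.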
---