Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/21295#discussion_r190252764
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java ---
@@ -225,7 +226,8 @@ protected void initialize(String path, List<String> columns) throws IOException
     this.sparkSchema = new ParquetToSparkSchemaConverter(config).convert(requestedSchema);
     this.reader = new ParquetFileReader(
       config, footer.getFileMetaData(), file, blocks, requestedSchema.getColumns());
-    for (BlockMetaData block : blocks) {
+    // use the blocks from the reader in case some do not match filters and will not be read
+    for (BlockMetaData block : reader.getRowGroups()) {
--- End diff ---
I think this is an existing issue; does your test case fail on Spark 2.3 too?
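For context on the change under review: `ParquetFileReader` applies row-group filters when it is constructed, so iterating the footer's `blocks` list can count row groups (and their rows) that the reader will never actually read, while `reader.getRowGroups()` returns only the surviving ones. The following is a minimal, self-contained sketch of that discrepancy; `FakeParquetFileReader`, the simplified `BlockMetaData` stand-in, and the `minValue` statistic are hypothetical illustrations, not the real Parquet API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for Parquet's BlockMetaData, reduced to what the
// illustration needs: a row count and one fake column statistic.
class BlockMetaData {
    final long rowCount;
    final long minValue; // pretend min-value statistic a row-group filter would consult
    BlockMetaData(long rowCount, long minValue) {
        this.rowCount = rowCount;
        this.minValue = minValue;
    }
}

// Hypothetical stand-in for ParquetFileReader: the real reader applies
// row-group filters at construction time; this fake mimics that by
// dropping any footer block the predicate rejects.
class FakeParquetFileReader {
    private final List<BlockMetaData> filteredBlocks = new ArrayList<>();
    FakeParquetFileReader(List<BlockMetaData> footerBlocks, Predicate<BlockMetaData> filter) {
        for (BlockMetaData b : footerBlocks) {
            if (filter.test(b)) {
                filteredBlocks.add(b);
            }
        }
    }
    List<BlockMetaData> getRowGroups() {
        return filteredBlocks;
    }
}

public class RowGroupCountDemo {
    // Sums row counts over whichever block list the caller iterates,
    // mirroring the loop changed in the diff.
    static long totalRowCount(List<BlockMetaData> blocks) {
        long total = 0;
        for (BlockMetaData b : blocks) {
            total += b.rowCount;
        }
        return total;
    }

    public static void main(String[] args) {
        List<BlockMetaData> footerBlocks = new ArrayList<>();
        footerBlocks.add(new BlockMetaData(100, 5));
        footerBlocks.add(new BlockMetaData(200, 50)); // pruned by the filter below

        // Filter keeps only row groups whose fake min value is below 10.
        FakeParquetFileReader reader =
            new FakeParquetFileReader(footerBlocks, b -> b.minValue < 10);

        // Iterating the footer blocks over-counts rows that will never be read.
        System.out.println("footer total = " + totalRowCount(footerBlocks));
        System.out.println("reader total = " + totalRowCount(reader.getRowGroups()));
    }
}
```

The diff's fix corresponds to switching the loop from the footer list to `reader.getRowGroups()`, so downstream row-count bookkeeping matches what is actually read.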
---