Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/10593#discussion_r49821548
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/UnsafeRowParquetRecordReader.java
---
@@ -103,6 +105,25 @@
private static final int DEFAULT_VAR_LEN_SIZE = 32;
/**
+ * columnarBatch object that is used for batch decoding. This is created on first use and
+ * triggers batched decoding. It is not valid to interleave calls to the batched interface
+ * with the row-by-row RecordReader APIs.
+ * This is only enabled with additional flags for development. It is still a work in
+ * progress, and currently unsupported cases will fail with potentially hard-to-diagnose
+ * errors. It should only be turned on for development work on this feature.
+ *
+ * TODOs:
+ *  - Implement all the encodings to support vectorized decoding.
+ *  - Implement the v2 page formats (just make sure we create the correct decoders).
+ */
+ private ColumnarBatch columnarBatch;
+
+ /**
+ * The default config on whether columnarBatch should be off-heap.
+ */
+ private static final boolean DEFAULT_OFFHEAP = false;
--- End diff --
this naming is fairly confusing.
---
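For context, the contract the comment describes ("not valid to interleave calls to the batched interface with the row-by-row RecordReader APIs") can be sketched as follows. This is a hypothetical, self-contained illustration of that pattern, not Spark's actual reader: the class name, methods, and the latched `batchedMode` flag are all assumptions made for the example.

```java
// Hypothetical sketch of a reader with both a row-by-row and a batched
// interface, where the first batched call latches the reader into batched
// mode and later row-by-row calls are rejected (as the Javadoc describes).
final class SketchRecordReader {
    private boolean batchedMode = false; // latched on first nextBatch() call
    private int remaining = 3;           // pretend three rows are left

    /** Row-by-row API; invalid once batched decoding has started. */
    int nextRow() {
        if (batchedMode) {
            throw new IllegalStateException(
                "cannot interleave row-by-row calls with the batched interface");
        }
        return remaining-- > 0 ? 1 : -1; // 1 = got a row, -1 = exhausted
    }

    /** Batched API; the first call switches the reader into batched mode. */
    int nextBatch() {
        batchedMode = true;              // batch decoding enabled on first use
        int n = remaining;
        remaining = 0;
        return n;                        // rows decoded in this batch
    }
}
```

The point of latching the mode rather than supporting both concurrently is that the two code paths can keep independent internal cursors; mixing them mid-stream would corrupt or skip rows.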