[GitHub] spark pull request #21657: [SPARK-24676][SQL] Project required data from CSV...

MaxGekk Wed, 04 Jul 2018 14:57:11 -0700

Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21657#discussion_r200197250
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala ---
    @@ -82,7 +83,12 @@ class UnivocityParser(
       //
       //   output row - ["A", 2]
       private val valueConverters: Array[ValueConverter] = {
    -    schema.map(f => makeConverter(f.name, f.dataType, f.nullable, options)).toArray
    +    requiredSchema.map(f => makeConverter(f.name, f.dataType, f.nullable, options)).toArray
    +  }
    +
    +  // If `columnPruning` disabled, this index is used to reorder parsed tokens
    +  private lazy val tokenIndexArr: Array[Int] = {
    +    requiredSchema.map(f => dataSchema.indexOf(f)).toArray
    --- End diff --
    
    I would apply this small optimization: `java.lang.Integer.valueOf(dataSchema.indexOf(f))`
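
A minimal sketch of what the suggestion might look like in isolation. The schemas are hypothetical stand-ins (plain field names rather than Spark `StructField`s), and `selectIndexes` is a made-up consumer taking `java.lang.Integer` varargs. The comment does not say why boxing helps; the assumption here is that the array is later passed to an API expecting boxed integers, so boxing once up front avoids converting on each use.

```scala
object TokenIndexSketch {
  // Hypothetical stand-ins for dataSchema / requiredSchema in the diff.
  val dataSchema: Seq[String] = Seq("a", "b", "c", "d")
  val requiredSchema: Seq[String] = Seq("c", "a")

  // As written in the diff: an array of primitive indexes.
  val tokenIndexArr: Array[Int] =
    requiredSchema.map(f => dataSchema.indexOf(f)).toArray

  // As suggested in the comment: box each index once when the array is built.
  val boxedTokenIndexArr: Array[java.lang.Integer] =
    requiredSchema.map(f => java.lang.Integer.valueOf(dataSchema.indexOf(f))).toArray

  // Hypothetical consumer that takes boxed indexes as varargs (an assumption,
  // not something named in the diff).
  def selectIndexes(indexes: java.lang.Integer*): Seq[Int] =
    indexes.map(_.intValue)

  def main(args: Array[String]): Unit = {
    println(tokenIndexArr.mkString(","))                          // prints "2,0"
    println(selectIndexes(boxedTokenIndexArr: _*).mkString(","))  // prints "2,0"
  }
}
```

Both arrays hold the same positions; only the element type differs, so the change would be local to the declaration of `tokenIndexArr`.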


