[GitHub] spark pull request #20894: [SPARK-23786][SQL] Checking column names of csv h...

gatorsmile Mon, 02 Apr 2018 09:10:21 -0700

GitHub user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20894#discussion_r178578044
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala ---
    @@ -50,7 +50,9 @@ abstract class CSVDataSource extends Serializable {
           conf: Configuration,
           file: PartitionedFile,
           parser: UnivocityParser,
    -      schema: StructType): Iterator[InternalRow]
    +      schema: StructType, // Schema of projection
    +      dataSchema: StructType // Schema of data in csv files
    +  ): Iterator[InternalRow]
    --- End diff --
    
    ```Scala
      /**
       * Parse a [[PartitionedFile]] into [[InternalRow]] instances.
       * @param requiredSchema Schema of projection
       * @param dataSchema Schema of data in CSV files
       */
      def readFile(
          conf: Configuration,
          file: PartitionedFile,
          parser: UnivocityParser,
          requiredSchema: StructType,
          dataSchema: StructType): Iterator[InternalRow]
    ```


