GitHub user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/20894#discussion_r178578044
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala
---
@@ -50,7 +50,9 @@ abstract class CSVDataSource extends Serializable {
conf: Configuration,
file: PartitionedFile,
parser: UnivocityParser,
- schema: StructType): Iterator[InternalRow]
+ schema: StructType, // Schema of projection
+ dataSchema: StructType // Schema of data in csv files
+ ): Iterator[InternalRow]
--- End diff --
```Scala
/**
 * Parse a [[PartitionedFile]] into [[InternalRow]] instances.
 *
 * @param requiredSchema Schema of projection
 * @param dataSchema Schema of data in CSV files
 */
def readFile(
    conf: Configuration,
    file: PartitionedFile,
    parser: UnivocityParser,
    requiredSchema: StructType,
    dataSchema: StructType): Iterator[InternalRow]
```
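
For readers unfamiliar with the two schemas, here is a minimal sketch in plain Scala (no Spark dependency) of how a projection schema can relate to the schema of the data in the file. `Field`, `Schema`, `ProjectionSketch`, and `project` are hypothetical names used only for illustration; they are not part of `CSVDataSource` or `UnivocityParser`, and the real parser may handle projection differently.

```Scala
// Hypothetical stand-ins for Spark's StructField/StructType; illustration only.
final case class Field(name: String)

final case class Schema(fields: Seq[Field]) {
  // Position of a column in this schema; fails loudly on unknown names.
  def fieldIndex(name: String): Int = {
    val i = fields.indexWhere(_.name == name)
    require(i >= 0, s"Field '$name' does not exist in the data schema")
    i
  }
}

object ProjectionSketch {
  // Keep only the columns named in requiredSchema, in requiredSchema order.
  // tokens holds one parsed CSV record laid out according to dataSchema.
  def project(
      tokens: Seq[String],
      dataSchema: Schema,
      requiredSchema: Schema): Seq[String] =
    requiredSchema.fields.map(f => tokens(dataSchema.fieldIndex(f.name)))

  def main(args: Array[String]): Unit = {
    val dataSchema = Schema(Seq(Field("id"), Field("name"), Field("age")))
    val requiredSchema = Schema(Seq(Field("age"), Field("id")))
    val record = Seq("1", "alice", "30")
    println(project(record, dataSchema, requiredSchema)) // List(30, 1)
  }
}
```

The sketch assumes that every field of `requiredSchema` also appears in `dataSchema`, which is why the two need distinct, clearly named parameters.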