Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/20894#discussion_r191819577
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala ---
@@ -206,12 +290,17 @@ object MultiLineCSVDataSource extends CSVDataSource {
       conf: Configuration,
       file: PartitionedFile,
       parser: UnivocityParser,
-      schema: StructType): Iterator[InternalRow] = {
+      requiredSchema: StructType,
+      dataSchema: StructType,
+      caseSensitive: Boolean): Iterator[InternalRow] = {
+    def checkHeader(header: Array[String]): Unit = {
+      CSVDataSource.checkHeaderColumnNames(dataSchema, header, file.filePath,
+        parser.options.enforceSchema, caseSensitive)
+    }
+
     UnivocityParser.parseStream(
       CodecStreams.createInputStreamWithCloseResource(conf, new Path(new URI(file.filePath))),
-      parser.options.headerFlag,
-      parser,
-      schema)
+      parser.options.headerFlag, parser, requiredSchema, checkHeader)
--- End diff ---
Nit: indentation. We should follow the original style.
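
Presumably the nit is that the changed `parseStream` call collapses several arguments onto one line, while the removed code kept one argument per line. A sketch of the style the comment seems to ask for (assuming the same argument names as in the diff):

```scala
UnivocityParser.parseStream(
  CodecStreams.createInputStreamWithCloseResource(conf, new Path(new URI(file.filePath))),
  parser.options.headerFlag,
  parser,
  requiredSchema,
  checkHeader)
```

This keeps the new `requiredSchema` and `checkHeader` arguments formatted the same way `parser.options.headerFlag`, `parser`, and `schema` were before the change.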
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]