GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20894#discussion_r188543968
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
    @@ -497,6 +498,11 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
               StructType(schema.filterNot(_.name == parsedOptions.columnNameOfCorruptRecord))
             val linesWithoutHeader: RDD[String] = maybeFirstLine.map { firstLine =>
    +          if (parsedOptions.enforceSchema == false) {
    +            CSVDataSource.checkHeader(firstLine, new CsvParser(parsedOptions.asParserSettings),
--- End diff --
Does this logic ever validate each file's header at all?
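
To make the concern concrete, here is a minimal sketch in plain Scala (no Spark). Everything in it is hypothetical and only for illustration: `headerMatches`, `firstLineOnly`, `everyFile`, and the sample data are not part of this PR, and this is not how `CSVDataSource.checkHeader` is implemented. It assumes the input is several files read as one sequence of lines, and contrasts checking only the first line overall with checking the first line of every file.

```scala
object HeaderCheckSketch {
  // Hypothetical helper: true when the comma-separated header line
  // equals the expected column names, in order.
  def headerMatches(headerLine: String, expected: Seq[String]): Boolean =
    headerLine.split(",").map(_.trim).toSeq == expected

  // Validates only the first line of the concatenated input.
  def firstLineOnly(files: Seq[Seq[String]], expected: Seq[String]): Boolean =
    files.flatten.headOption.forall(headerMatches(_, expected))

  // Validates the first line of every file separately.
  def everyFile(files: Seq[Seq[String]], expected: Seq[String]): Boolean =
    files.forall(_.headOption.forall(headerMatches(_, expected)))

  def main(args: Array[String]): Unit = {
    val expected = Seq("id", "name")
    // Assumed sample data: the second file has its columns swapped.
    val files = Seq(Seq("id,name", "1,a"), Seq("name,id", "b,2"))
    println(s"first line only: ${firstLineOnly(files, expected)}")
    println(s"every file:      ${everyFile(files, expected)}")
  }
}
```

With this sample data the first-line-only check passes while the per-file check fails, which is the case the question above is asking about.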