Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/20894#discussion_r188651192
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -497,6 +498,11 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
       StructType(schema.filterNot(_.name == parsedOptions.columnNameOfCorruptRecord))
     val linesWithoutHeader: RDD[String] = maybeFirstLine.map { firstLine =>
+      if (parsedOptions.enforceSchema == false) {
+        CSVDataSource.checkHeader(firstLine, new CsvParser(parsedOptions.asParserSettings),
--- End diff --
The CSV header should be checked every time it is removed. Do you think I
missed some cases?
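
To make the intent concrete, here is a minimal, self-contained sketch of the idea: wherever the header line is dropped, it is first validated against the schema unless `enforceSchema` is set. The object `HeaderCheckSketch`, the `dropHeader` helper, and the naive comma split are assumptions for illustration only. They are not the actual `CSVDataSource.checkHeader` implementation, which takes a configured `CsvParser`.

```scala
// Hypothetical illustration; names and signatures here are NOT Spark's real API.
object HeaderCheckSketch {

  // Compare the header's column names with the schema's field names, position by position.
  def checkHeader(headerLine: String, fieldNames: Seq[String], caseSensitive: Boolean): Unit = {
    // Naive split as a stand-in for a real CSV parser (no quoting or escaping support).
    val columns = headerLine.split(",", -1).map(_.trim).toSeq
    def norm(s: String): String = if (caseSensitive) s else s.toLowerCase

    require(columns.length == fieldNames.length,
      s"Header has ${columns.length} columns but the schema has ${fieldNames.length} fields")

    columns.zip(fieldNames).zipWithIndex.foreach { case ((column, field), i) =>
      require(norm(column) == norm(field),
        s"Header column '$column' at position $i does not match schema field '$field'")
    }
  }

  // Drop the header line, validating it first unless enforceSchema is true.
  def dropHeader(
      lines: Seq[String],
      fieldNames: Seq[String],
      enforceSchema: Boolean,
      caseSensitive: Boolean = false): Seq[String] = {
    lines.headOption match {
      case Some(firstLine) =>
        if (!enforceSchema) checkHeader(firstLine, fieldNames, caseSensitive)
        lines.tail
      case None => lines
    }
  }
}
```

With this shape, every code path that strips the header goes through a single helper, so a mismatch raises an `IllegalArgumentException` instead of being silently ignored.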
---