Github user gengliangwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/20894#discussion_r188535925
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
    @@ -497,6 +498,11 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
             StructType(schema.filterNot(_.name == parsedOptions.columnNameOfCorruptRecord))
           val linesWithoutHeader: RDD[String] = maybeFirstLine.map { firstLine =>
    +        if (parsedOptions.enforceSchema == false) {
    +          CSVDataSource.checkHeader(firstLine, new CsvParser(parsedOptions.asParserSettings),
--- End diff --
`checkHeader` is also called in `readFile`. Is it possible that the header
check runs twice for the same input?
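
One way to answer this empirically would be to count invocations in a test. The following is only a minimal sketch: `HeaderCheckSketch`, its `checkHeader`, and `readLines` are hypothetical stand-ins written for illustration and are not Spark's `CSVDataSource` or `DataFrameReader` code. It assumes the header is a comma-separated line and that the check is skipped when the schema is enforced.

```scala
// Hypothetical stand-ins for illustration only; not Spark's CSVDataSource API.
object HeaderCheckSketch {
  // Number of times the header was actually validated.
  var checkCount: Int = 0

  // Validates the header against the expected column names,
  // unless the schema is enforced (in which case the check is skipped).
  def checkHeader(header: String, expected: Seq[String], enforceSchema: Boolean): Unit = {
    if (!enforceSchema) {
      checkCount += 1
      val actual = header.split(",").map(_.trim).toSeq
      require(actual == expected, s"Header $actual does not match schema $expected")
    }
  }

  // Mimics a read path that holds the first line, checks it, then drops it.
  def readLines(lines: Seq[String], expected: Seq[String], enforceSchema: Boolean): Seq[String] = {
    lines.headOption.foreach(h => checkHeader(h, expected, enforceSchema))
    lines.drop(1)
  }
}
```

If a single read through the real code paths left such a counter at 2, that would confirm the duplicate call; a value of 1 would suggest the two call sites belong to separate paths.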