GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22656#discussion_r223566522
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -505,7 +505,8 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
val actualSchema = StructType(schema.filterNot(_.name == parsedOptions.columnNameOfCorruptRecord))
- val linesWithoutHeader: RDD[String] = maybeFirstLine.map { firstLine =>
+ val linesWithoutHeader = if (parsedOptions.headerFlag && maybeFirstLine.isDefined) {
--- End diff --
LGTM, but it really needs some refactoring. Let me give it a shot.
---