[GitHub] spark pull request #22656: [SPARK-25669][SQL] Check CSV header only when it ...

HyukjinKwon Mon, 08 Oct 2018 22:51:02 -0700

GitHub user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22656#discussion_r223566522
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
    @@ -505,7 +505,8 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
         val actualSchema =
           StructType(schema.filterNot(_.name == parsedOptions.columnNameOfCorruptRecord))
     
    -    val linesWithoutHeader: RDD[String] = maybeFirstLine.map { firstLine =>
    +    val linesWithoutHeader = if (parsedOptions.headerFlag && maybeFirstLine.isDefined) {
    --- End diff --
    
    LGTM, but it really needs some refactoring. Let me give it a shot.



