[GitHub] spark pull request #22367: [SPARK-17916][SPARK-25241][SQL][FOLLOWUP] Fix emp...

HyukjinKwon Sun, 09 Sep 2018 21:41:40 -0700

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22367#discussion_r216196505
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala ---
    @@ -91,9 +91,10 @@ abstract class CSVDataSource extends Serializable {
           }
     
           row.zipWithIndex.map { case (value, index) =>
    -        if (value == null || value.isEmpty || value == options.nullValue) {
    -          // When there are empty strings or the values set in `nullValue`, put the
    -          // index as the suffix.
    +        if (value == null || value.isEmpty || value == options.nullValue ||
    +          value == options.emptyValueInRead) {
    --- End diff --
    
    @MaxGekk, can we drop this change (and the one in `CSVInferSchema.scala`)
    for now, since we are targeting 2.4? IIRC (I need to double-check), this
    `makeSafeHeader` behaviour comes from R's `read.csv`. We should check
    whether the change is consistent with that or not.
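
For context, here is a minimal standalone sketch of the header check under discussion. It is not the Spark implementation: `SafeHeaderSketch`, `safeHeader`, and the `nullValue`/`emptyValue` parameters are hypothetical stand-ins for `makeSafeHeader`, `options.nullValue`, and `options.emptyValueInRead`, and the `_c<index>` replacement name is an assumption based on the "put the index as the suffix" comment in the diff.

```scala
// Simplified sketch only; names and the "_c<index>" format are assumptions.
object SafeHeaderSketch {
  // emptyValue = None models the old condition (before the diff);
  // emptyValue = Some(...) models the extra check added by the diff.
  def safeHeader(
      row: Array[String],
      nullValue: String,
      emptyValue: Option[String]): Array[String] =
    row.zipWithIndex.map { case (value, index) =>
      val isPlaceholder =
        value == null || value.isEmpty || value == nullValue ||
          emptyValue.contains(value)
      // Placeholder headers get a positional name; others are kept as-is.
      if (isPlaceholder) s"_c$index" else value
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical header row: a real name, an empty string, the null
    // marker, and a value equal to the (hypothetical) empty-value marker.
    val header = Array("id", "", "NA", "EMPTY")
    println(safeHeader(header, "NA", None).mkString(","))
    println(safeHeader(header, "NA", Some("EMPTY")).mkString(","))
  }
}
```

The only difference between the two calls is the last column: with the extra check, a header equal to the empty-value marker is also replaced by a positional name, which is the behaviour change the comment asks to defer.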



