[GitHub] spark pull request #21380: [SPARK-24329][SQL] Remove comments filtering befo...

HyukjinKwon Tue, 22 May 2018 02:10:21 -0700

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21380#discussion_r189828536
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala ---
    @@ -300,14 +302,11 @@ private[csv] object UnivocityParser {
           lines
         }
     
    -    val filteredLines: Iterator[String] =
    -      CSVUtils.filterCommentAndEmpty(linesWithoutHeader, options)
    --- End diff --
    
    @MaxGekk, it doesn't always mean that we have tests for this. The filtering 
was there from the beginning, and when I tried to remove it before, the tests 
broke. I expected them to break again, but they seem to pass now, so I'm just 
guessing that it's fixed.
    
    Usually we trust, but we should be careful when issues have been found. 
I don't think we should make this case special, and I'm not seeing a meaningful 
improvement either.
    
    One nit, BTW: the purpose of `ignoreLeadingWhiteSpaceInRead` and 
`ignoreTrailingWhiteSpaceInRead` is basically to trim whitespace in the values, 
not to skip empty lines.
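
    To illustrate the distinction, here is a minimal sketch in plain Scala. It 
is not Spark or univocity code, and the object and helper names 
(`TrimVsSkip`, `trimValues`, `skipEmpty`) are hypothetical, made up for this 
example only. Trimming changes the values inside a line, while skipping 
changes which lines are kept.

```scala
import java.util.regex.Pattern

// Hypothetical helpers for illustration only; not Spark or univocity APIs.
object TrimVsSkip {
  // Trimming works per value: every line is still returned as a row,
  // only the leading/trailing whitespace of each value is removed.
  def trimValues(line: String, sep: Char = ','): Seq[String] =
    line.split(Pattern.quote(sep.toString), -1).toSeq.map(_.trim)

  // Skipping works per line: blank lines are dropped entirely,
  // and the values in the remaining lines are left untouched.
  def skipEmpty(lines: Seq[String]): Seq[String] =
    lines.filter(_.trim.nonEmpty)
}
```

    Under these assumptions, a whitespace-only line passed to `trimValues` 
still comes back as a row holding a single empty value, so trimming alone does 
not remove empty lines. That needs a separate step such as `skipEmpty`.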


