Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/21380#discussion_r189794437
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala
---
@@ -300,14 +302,11 @@ private[csv] object UnivocityParser {
lines
}
- val filteredLines: Iterator[String] =
- CSVUtils.filterCommentAndEmpty(linesWithoutHeader, options)
--- End diff --
Probably, you observed issues in old versions of the uniVocity parser, as
@maropu wrote above. I would propose removing the filtering until we hit cases
where uniVocity's filtering doesn't work as expected. If that happens, we can
submit an issue to uniVocity and revert this change.
> I think we already do such things in Spark side redundantly to make sure
in few places.
I looked at other places where we do similar filtering, but this is the only
place where we filter directly before handing lines to uniVocity.
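For context, here is a minimal sketch of the kind of Spark-side pre-filtering
being removed in this diff (a hypothetical standalone helper, not the actual
`CSVUtils.filterCommentAndEmpty` implementation). uniVocity can perform the
same skipping itself, e.g. via `CsvParserSettings.setSkipEmptyLines` and the
format's comment character, which is why this pre-filter may be redundant:

```scala
// Hedged sketch: drop blank lines and lines starting with the comment
// character before parsing. The comment char defaults to '#' here only
// for illustration; Spark takes it from CSVOptions.
def filterCommentAndEmpty(lines: Iterator[String], comment: Char = '#'): Iterator[String] =
  lines.filter { line =>
    val trimmed = line.trim
    // Keep a line only if it is non-blank and not a comment line.
    trimmed.nonEmpty && trimmed.head != comment
  }
```

Since `Iterator.filter` is lazy, the sketch streams over the input without
materializing it, matching how the parser consumes lines.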
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]