Maxim Gekk created SPARK-24329:
----------------------------------

             Summary: Remove comments filtering before parsing of CSV files
                 Key: SPARK-24329
                 URL: https://issues.apache.org/jira/browse/SPARK-24329
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: Maxim Gekk


Comment and whitespace filtering is already performed by the uniVocity parser, 
according to the parser settings:
https://github.com/apache/spark/blob/branch-2.3/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala#L178-L180

It is not necessary to do the same filtering before parsing. We need to inspect 
every place where the filterCommentAndEmpty method is called, and remove the 
call wherever it duplicates the filtering done by the uniVocity parser.
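
To illustrate the redundancy described above, here is a minimal, self-contained 
sketch. It does not use Spark or uniVocity: filter_comment_and_empty is a 
hypothetical approximation of the pre-parse filter, parse_skipping_comments is 
a toy stand-in for a parser configured to skip comments and blank lines, and 
the "#" comment character is an assumption made for the example.

```python
import csv
from typing import Iterable, Iterator, List

COMMENT = "#"  # assumed comment character, for illustration only


def filter_comment_and_empty(lines: Iterable[str], comment: str = COMMENT) -> Iterator[str]:
    """Hypothetical pre-parse filter: drop blank lines and comment lines."""
    for line in lines:
        if line.strip() and not line.startswith(comment):
            yield line


def parse_skipping_comments(lines: Iterable[str], comment: str = COMMENT) -> List[List[str]]:
    """Toy parser that does its own comment and whitespace handling."""
    rows = []
    for line in lines:
        # The parser itself skips blank lines and comment lines.
        if not line.strip() or line.startswith(comment):
            continue
        # Split one CSV record and trim whitespace around each field.
        rows.append([field.strip() for field in next(csv.reader([line]))])
    return rows


raw = ["# header comment", "a, b ,c", "", "   ", "# another", "1,2,3"]

direct = parse_skipping_comments(raw)
prefiltered = parse_skipping_comments(filter_comment_and_empty(raw))

# Filtering before parsing changes nothing when the parser filters anyway.
assert direct == prefiltered
print(direct)
```

In this sketch the pre-filter only repeats work the parser already does, which 
is the kind of duplication the issue proposes to look for and remove.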



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
