Heath Abelson created SPARK-26040:
-------------------------------------

             Summary: CSV Row delimiters not consistent between platforms
                 Key: SPARK-26040
                 URL: https://issues.apache.org/jira/browse/SPARK-26040
             Project: Spark
          Issue Type: Bug
          Components: Java API
    Affects Versions: 2.3.0
            Reporter: Heath Abelson


When a Spark job runs on *nix platforms, only Unix-style row delimiters (\n) 
are recognized. When the job runs on Windows, only Windows-style delimiters 
(\r\n) are recognized.

As a result, when Spark running on Linux reads a CSV file generated by MS 
Excel, extra characters are included in the field names and field values that 
come last on each line.
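
The symptom described above can be reproduced outside Spark with a small 
sketch. This is an illustration only, not Spark's parser: the helper 
split_rows_unix_only is hypothetical and simply assumes that "\n" is the sole 
row delimiter, and the sample data is made up.

```python
# Illustration only: a naive splitter standing in for a parser that treats
# "\n" as the only row delimiter. This is NOT Spark's code.

# Made-up sample: a CSV with Windows-style (\r\n) line endings.
data = "id,name\r\n1,alice\r\n2,bob\r\n"


def split_rows_unix_only(text):
    """Split rows on "\n" only, then split fields on commas."""
    rows = [line for line in text.split("\n") if line]
    return [row.split(",") for row in rows]


rows = split_rows_unix_only(data)
header, first = rows[0], rows[1]

# The carriage return stays attached to the last field of every row.
print(repr(header[-1]))  # 'name\r'
print(repr(first[-1]))   # 'alice\r'
```

Under that assumption, the stray "\r" ends up in the last column name and in 
the last value of each row, which matches the behavior reported here.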

Ideally, the CSV parser would handle both flavors of line endings regardless 
of the platform the job runs on.
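
As a rough sketch of what platform-independent handling could look like, the 
following uses Python's standard csv module, whose reader accepts both "\n" 
and "\r\n" as row terminators. The helper name parse_rows is hypothetical, and 
this is not a proposal for how Spark itself should implement the fix.

```python
import csv
import io


def parse_rows(text):
    """Parse CSV text, accepting either \n or \r\n as the row delimiter.

    newline="" hands line endings to the csv reader untranslated; the
    reader itself recognizes both styles as end-of-row.
    """
    return list(csv.reader(io.StringIO(text, newline="")))


# Made-up samples: same content, different line endings.
windows_rows = parse_rows("id,name\r\n1,alice\r\n")
unix_rows = parse_rows("id,name\n1,alice\n")

# Both inputs yield identical rows, with no stray "\r" in the last field.
print(windows_rows == unix_rows)  # True
```

With this approach the same input file produces the same rows whether the 
code runs on Linux or on Windows.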



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

