[jira] [Updated] (SPARK-42661) CSV Reader - multiline without quoted fields

Florian FERREIRA (Jira) Fri, 03 Mar 2023 03:19:04 -0800


     [ 
https://issues.apache.org/jira/browse/SPARK-42661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Florian FERREIRA updated SPARK-42661:
-------------------------------------
    Attachment: Capture d’écran 2023-03-03 à 12.18.07.png

> CSV Reader - multiline without quoted fields
> --------------------------------------------
>
>                 Key: SPARK-42661
>                 URL: https://issues.apache.org/jira/browse/SPARK-42661
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.3.1
>         Environment: unquoted data
> {code}
> NAME,Address,CITY
> Atlassian,Level 6 341 George Street
> Sydney NSW 2000 Australia,Sydney
> Github,88 Colin P Kelly Junior Street
> San Francisco CA 94107 USA,San Francisco
> {code}
> quoted data:
> {code}
> "NAME","Address","CITY"
> "Atlassian","Level 6 341 George Street
> Sydney NSW 2000 Australia","Sydney"
> "Github","88 Colin P Kelly Junior Street
> San Francisco CA 94107 USA","San Francisco"
> {code}
>            Reporter: Florian FERREIRA
>            Priority: Minor
>         Attachments: Capture d’écran 2023-03-03 à 12.18.07.png
>
>
> Hello,
> We are facing an issue with the CSV reader.
> When we read a multiline file whose fields are not quoted, the result is 
> not what we expect.
> With quoted fields, everything works (see the screenshot).
> You can reproduce it easily with this code (just replace the file paths):
> {code:scala}
> // Unquoted file: an empty "quote" option turns quoting off
> spark.read.options(Map(
>   "multiline" -> "true",
>   "quote" -> "",
>   "header" -> "true"
> )).csv("/Users/fferreira/correct_multiline.csv").show(false)
>
> // Quoted file: default quote character
> spark.read.options(Map(
>   "multiline" -> "true",
>   "header" -> "true"
> )).csv("/Users/fferreira/correct_multiline_with_quote.csv").show(false)
> {code}
> We are continuing to investigate on our side.
> Thank you.
> !image-2023-03-03-12-11-21-258.png!
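
Without quotes, nothing in the unquoted sample marks whether a line break ends a record or sits inside a field. The following is a minimal sketch of one possible pre-processing workaround, not Spark's behavior and not a fix proposed in this ticket. It assumes that no field contains the separator, that every record has as many fields as the header, and that a line break never falls inside the last field. The names UnquotedMultiline and reassemble are hypothetical.

```scala
import java.util.regex.Pattern

// Hypothetical helper, not part of Spark: joins physical lines into logical
// records by counting separators against the header line.
object UnquotedMultiline {
  def reassemble(lines: Seq[String], sep: Char = ','): List[List[String]] = {
    // Number of separators in one complete record, taken from the header.
    val expected = lines.headOption.map(_.count(_ == sep)).getOrElse(0)
    // Quote the separator so characters such as '|' are not read as regex.
    val splitter = Pattern.quote(sep.toString)
    val records = List.newBuilder[List[String]]
    var pending: Option[String] = None // physical lines of an unfinished record

    for (line <- lines) {
      // Re-attach the line to the unfinished record, restoring the line break.
      val joined = pending.fold(line)(_ + "\n" + line)
      if (joined.count(_ == sep) >= expected) {
        records += joined.split(splitter, -1).toList
        pending = None
      } else {
        pending = Some(joined)
      }
    }
    // Keep a trailing incomplete record rather than dropping data silently.
    pending.foreach(p => records += p.split(splitter, -1).toList)
    records.result()
  }
}
```

Applied to the five lines of the unquoted sample from the Environment field, this yields three records (the header plus two rows), with each address keeping its embedded line break. The records could then be written back with quoted fields so that the multiline reader handles them like the quoted sample.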



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
