[jira] [Comment Edited] (SPARK-26804) Spark sql carries newline char from last csv column when imported

Raj (JIRA) Mon, 04 Feb 2019 09:25:17 -0800


    [ 
https://issues.apache.org/jira/browse/SPARK-26804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760036#comment-16760036
 ]


Raj edited comment on SPARK-26804 at 2/4/19 5:22 PM:
-----------------------------------------------------

Hi Hyukjin,

I have attached a sample file so you can reproduce the issue at your end. The 
screenshot below also shows the commands I run in Databricks, which should help 
recreate it. In the third command box you can see that Col3 carries an extra 
character, highlighted in blue, which appears when you double-click the column.

To analyse further, you can download the output as a CSV file and inspect the 
extra character there.
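
If it helps when inspecting the downloaded file, here is a small Python sketch 
that reports which invisible characters sit at the end of a column name. The 
`header` list is a made-up stand-in for the real header, and the carriage 
return in it is only an assumption about what the extra character might be; 
nothing in this ticket confirms it.

```python
def trailing_control_chars(name):
    """Return code points of non-printable or whitespace characters at the end of name."""
    found = []
    for ch in reversed(name):
        # Stop at the first ordinary visible character, scanning from the end.
        if ch.isprintable() and not ch.isspace():
            break
        found.append(f"U+{ord(ch):04X}")
    return list(reversed(found))


# Hypothetical header; the real one comes from the attached TestFile.csv.
header = ["Col1", "Col2", "Col3\r"]
for col in header:
    print(repr(col), trailing_control_chars(col))
```

Printing with `repr` makes a character that is otherwise invisible show up as 
an escape sequence.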

Note: If I remove the multiLine = true option, the columns load correctly and 
the extra character no longer appears in my last column. However, my data 
contains values that span multiple lines, so I need to keep this option set.
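
One possible stopgap, sketched below in Python, is to rewrite the file so that 
every record ends in a bare line feed while quoted multi-line values are kept, 
and then load the rewritten copy with multiLine still enabled. This assumes the 
extra character is the carriage return of a Windows-style `\r\n` line ending, 
which is unconfirmed in this ticket; `rewrite_with_lf` and `sample` are 
hypothetical names and data, not part of the attached file.

```python
import csv
import io


def rewrite_with_lf(src_text):
    """Re-emit CSV text with "\n" record terminators, keeping quoted multi-line values."""
    # newline="" stops Python from translating line endings, so the csv
    # module sees the original terminators and handles them itself.
    reader = csv.reader(io.StringIO(src_text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Hypothetical sample: CRLF record endings plus one quoted two-line value.
sample = 'Col1,Col2,Col3\r\n1,"line one\nline two",a\r\n'
print(repr(rewrite_with_lf(sample)))
```

Line breaks inside quoted values are passed through as they are, so only the 
record terminators change.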

 

!image-2019-02-04-12-09-19-210.png!

[^TestFile.csv]

 

^Hope this helps^

^Thanks,^

^Raj^

 


was (Author: hipruthvi):
Hi Hyukjin,

I have attached a sample file so you can reproduce the issue at your end. The 
screenshot below also shows the commands I run in Databricks, which should help 
recreate it. In the third command box you can see that Col3 carries an extra 
character, highlighted in blue, which appears when you double-click the column.

To analyse further, you can download these column names as an Excel file and 
see the extra character. 

Note: If I remove the multiLine = true option, the columns load correctly and 
the extra character no longer appears in my last column. However, my data 
contains values that span multiple lines, so I need to keep this option set.

 

!image-2019-02-04-12-09-19-210.png!

[^TestFile.csv]

 

^Hope this helps^

^Thanks,^

^Raj^

 

> Spark sql carries newline char from last csv column when imported
> -----------------------------------------------------------------
>
>                 Key: SPARK-26804
>                 URL: https://issues.apache.org/jira/browse/SPARK-26804
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Raj
>            Priority: Major
>         Attachments: TestFile.csv, image-2019-02-04-12-09-19-210.png
>
>
> I am trying to create external SQL tables in Databricks using the Spark SQL 
> query below. The query reads a CSV file and creates the external table, but 
> it carries the newline character into the last column. Is there a way to 
> resolve this issue? 
>  
> %sql
> create table if not exists <<My table name>>
> using CSV
> options ("header"="true", "inferschema"="true","multiLine"="true", 
> "escape"='"')
> location <my csv path>



