[
https://issues.apache.org/jira/browse/SPARK-26804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760036#comment-16760036
]
Raj edited comment on SPARK-26804 at 2/4/19 5:22 PM:
-----------------------------------------------------
Hi Hyukjin,
I have attached a sample file so you can reproduce the issue at your end. The
screenshot below shows the commands I use in Databricks to recreate it. In the
third command box you can see that Col3 has an extra character, highlighted in
blue, which appears when you double-click the column.
To analyse further, you can download the result as a CSV file and see the extra
character.
Note: If I remove the multiLine = true option, the columns work fine and the
extra character is removed from my last column. However, my data has values
that span multiple lines, so I need this option set.
!image-2019-02-04-12-09-19-210.png!
[^TestFile.csv]
^Hope this helps^
^Thanks,^
^Raj^
was (Author: hipruthvi):
Hi Hyukjin,
I have attached a sample file so you can reproduce the issue at your end. The
screenshot below shows the commands I use in Databricks to recreate it. In the
third command box you can see that Col3 has an extra character, highlighted in
blue, which appears when you double-click the column.
To analyse further, you can download the column names into an Excel file and
see the extra character.
Note: If I remove the multiLine = true option, the columns work fine and the
extra character is removed from my last column. However, my data has values
that span multiple lines, so I need this option set.
!image-2019-02-04-12-09-19-210.png!
[^TestFile.csv]
^Hope this helps^
^Thanks,^
^Raj^
> Spark sql carries newline char from last csv column when imported
> -----------------------------------------------------------------
>
> Key: SPARK-26804
> URL: https://issues.apache.org/jira/browse/SPARK-26804
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.4.0
> Reporter: Raj
> Priority: Major
> Attachments: TestFile.csv, image-2019-02-04-12-09-19-210.png
>
>
> I am trying to create external SQL tables in Databricks using a Spark SQL
> query, shown below. The query reads a CSV file and creates an external table,
> but it carries the newline character into the last column. Is there a
> way to resolve this issue?
>
> %sql
> create table if not exists <<My table name>>
> using CSV
> options ("header"="true", "inferschema"="true","multiLine"="true",
> "escape"='"')
> location <my csv path>
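
One possible explanation, offered only as an assumption since the thread does not confirm it, is that the file uses Windows CRLF line endings and that in multi-line mode only "\n" is treated as the record terminator, which would leave a stray "\r" on the last column and on its header. The plain-Python sketch below illustrates that mechanism and a cleanup step. It does not use Spark; the sample data and the helper names `split_records_on_lf` and `strip_trailing_cr` are hypothetical.

```python
# Hypothetical stand-in for TestFile.csv; the CRLF line endings are an assumption.
RAW = 'Col1,Col2,Col3\r\n1,"first\nsecond",x\r\n'


def split_records_on_lf(text):
    """Parse CSV text treating only "\n" as the record terminator.

    Simplified on purpose: double quotes toggle quoted mode, so newlines
    inside quotes stay in the field. Escaped quotes are not preserved.
    """
    records, row, field, in_quotes = [], [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            row.append("".join(field))
            field = []
        elif ch == "\n" and not in_quotes:
            row.append("".join(field))
            records.append(row)
            row, field = [], []
        else:
            field.append(ch)
    if field or row:  # final record without a trailing newline
        row.append("".join(field))
        records.append(row)
    return records


def strip_trailing_cr(records):
    """Remove a trailing carriage return from the last field of each record."""
    return [[*row[:-1], row[-1].rstrip("\r")] for row in records if row]


if __name__ == "__main__":
    parsed = split_records_on_lf(RAW)
    print(repr(parsed[0][-1]))        # last header name still carries "\r"
    print(strip_trailing_cr(parsed))  # header and data rows cleaned
```

If this assumption holds for the attached file, converting it to LF line endings before loading, or trimming the trailing character from the last column after loading, might avoid the problem while keeping the multiLine option. Neither approach has been verified against this issue.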
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)