Thanks Sean, got it.
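
For the archive, the manual rewrite suggested below might look roughly like this. It is only a sketch under stated assumptions: `FixCsvQuotes` and `escapeMiddleColumn` are hypothetical names, every record is assumed to sit on a single line with exactly three columns, and only the middle column is assumed to contain commas or double quotes. Inner quotes are doubled, which matches a reader whose `escape` option is set to `"`.

```scala
object FixCsvQuotes {
  // Hypothetical helper (not from the thread). Assumes each record sits on one
  // line with exactly three columns, and that only the middle column may
  // contain commas or double quotes.
  def escapeMiddleColumn(line: String): String = {
    val first = line.indexOf(',')      // comma that ends column1
    val last  = line.lastIndexOf(',')  // comma that starts column3
    if (first < 0 || last <= first) {
      line // fewer than three columns: leave the line untouched
    } else {
      val middle = line.substring(first + 1, last).trim
      // Drop the outer quotes if present, so only the inner quotes get doubled.
      val inner =
        if (middle.length >= 2 && middle.startsWith("\"") && middle.endsWith("\""))
          middle.substring(1, middle.length - 1)
        else middle
      val escaped = inner.replace("\"", "\"\"")
      line.substring(0, first) + ",\"" + escaped + "\"," + line.substring(last + 1)
    }
  }
}
```

One possible way to apply it in Spark is to read the file with `spark.read.textFile(...)`, map each line through the helper (this needs `import spark.implicits._`), and pass the resulting `Dataset[String]` to `.csv(...)` with the same `header`, `quote` and `escape` options. The header line passes through with `column2` wrapped in quotes, which should still parse as a plain column name.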

Thanks,
Elango

On Thu, May 28, 2020, 9:04 PM Sean Owen <sro...@gmail.com> wrote:

> I don't think so, that data is inherently ambiguous and incorrectly
> formatted. If you know something about the structure, maybe you can rewrite
> the middle column manually to escape the inner quotes and then reparse.
>
> On Thu, May 28, 2020 at 10:25 AM elango vaidyanathan <elango...@gmail.com>
> wrote:
>
>> Is there any way I can handle it in code?
>>
>> Thanks,
>> Elango
>>
>> On Thu, May 28, 2020, 8:52 PM Sean Owen <sro...@gmail.com> wrote:
>>
>>> Your data doesn't escape double-quotes.
>>>
>>> On Thu, May 28, 2020 at 10:21 AM elango vaidyanathan <
>>> elango...@gmail.com> wrote:
>>>
>>>>
>>>> Hi team,
>>>>
>>>> I am loading a CSV file. One column contains a JSON value, and I am unable
>>>> to parse that column properly. The details are below. Could you please take
>>>> a look?
>>>>
>>>>
>>>>
>>>> val df1 = spark.read
>>>>   .option("inferSchema", "true")
>>>>   .option("header", "true")
>>>>   .option("quote", "\"")
>>>>   .option("escape", "\"")
>>>>   .csv("/FileStore/tables/sample_file_structure.csv")
>>>>
>>>>
>>>>
>>>> sample data:
>>>>
>>>> ----------------
>>>>
>>>> column1,column2,column3
>>>>
>>>> 123456789,"{   "moveId" : "123456789",   "dob" : null,   "username" : "abcdef",   "language" : "en" }",11
>>>>
>>>> 123456789,"{   "moveId" : "123456789",   "dob" : null,   "username" : "ghi, jkl",   "language" : "en" }",12
>>>>
>>>> 123456789,"{   "moveId" : "123456789",   "dob" : null,   "username" : "mno, pqr",   "language" : "en" }",13
>>>>
>>>>
>>>>
>>>> output:
>>>>
>>>> -----------
>>>>
>>>> +---------+--------------------+---------------+
>>>> |  column1|             column2|        column3|
>>>> +---------+--------------------+---------------+
>>>> |123456789|"{   "moveId" : "...|   "dob" : null|
>>>> |123456789|"{   "moveId" : "...|   "dob" : null|
>>>> +---------+--------------------+---------------+
>>>>
>>>>
>>>>
>>>> Thanks,
>>>> Elango
>>>>
>>>
