Thanks Sean, got it.

Thanks,
Elango
On Thu, May 28, 2020, 9:04 PM Sean Owen <sro...@gmail.com> wrote:

> I don't think so; that data is inherently ambiguous and incorrectly
> formatted. If you know something about the structure, maybe you can rewrite
> the middle column manually to escape the inner quotes and then reparse.
>
> On Thu, May 28, 2020 at 10:25 AM elango vaidyanathan <elango...@gmail.com>
> wrote:
>
>> Is there any way I can handle it in code?
>>
>> Thanks,
>> Elango
>>
>> On Thu, May 28, 2020, 8:52 PM Sean Owen <sro...@gmail.com> wrote:
>>
>>> Your data doesn't escape double-quotes.
>>>
>>> On Thu, May 28, 2020 at 10:21 AM elango vaidyanathan <
>>> elango...@gmail.com> wrote:
>>>
>>>> Hi team,
>>>>
>>>> I am loading a CSV. One column contains a JSON value, and I am unable
>>>> to parse that column properly. The details are below. Can you please
>>>> check?
>>>>
>>>> val df1 = spark.read
>>>>   .option("inferSchema", "true")
>>>>   .option("header", "true")
>>>>   .option("quote", "\"")
>>>>   .option("escape", "\"")
>>>>   .csv("/FileStore/tables/sample_file_structure.csv")
>>>>
>>>> sample data:
>>>> ----------------
>>>> column1,column2,column3
>>>> 123456789,"{ "moveId" : "123456789", "dob" : null, "username" : "abcdef", "language" : "en" }",11
>>>> 123456789,"{ "moveId" : "123456789", "dob" : null, "username" : "ghi, jkl", "language" : "en" }",12
>>>> 123456789,"{ "moveId" : "123456789", "dob" : null, "username" : "mno, pqr", "language" : "en" }",13
>>>>
>>>> output:
>>>> -----------
>>>> +---------+--------------------+---------------+
>>>> |  column1|             column2|        column3|
>>>> +---------+--------------------+---------------+
>>>> |123456789|  "{ "moveId" : "...|   "dob" : null|
>>>> |123456789|  "{ "moveId" : "...|   "dob" : null|
>>>> +---------+--------------------+---------------+
>>>>
>>>> Thanks,
>>>> Elango
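
Sean's suggestion above (rewrite the middle column to escape the inner quotes, then reparse) could be sketched roughly as follows in plain Scala. This is only an illustration under assumptions the thread does not confirm: every record sits on one physical line, the first and last columns never contain a comma, and the middle column is always wrapped in exactly one pair of outer double quotes. The names EscapeJsonColumn and escapeMiddleColumn are made up for this sketch.

```scala
object EscapeJsonColumn {

  // Hypothetical helper: rewrites one raw line shaped like
  //   first,"{ "key" : "value, with comma" }",last
  // by doubling every double quote inside the middle column.
  // Assumes the first and last columns contain no commas.
  def escapeMiddleColumn(line: String): String = {
    val firstComma = line.indexOf(',')
    val lastComma = line.lastIndexOf(',')
    if (firstComma < 0 || firstComma == lastComma) {
      line // fewer than three columns: leave untouched
    } else {
      val head = line.substring(0, firstComma)
      val middle = line.substring(firstComma + 1, lastComma).trim
      val tail = line.substring(lastComma + 1)
      val isQuoted =
        middle.length >= 2 && middle.startsWith("\"") && middle.endsWith("\"")
      if (!isQuoted) {
        line // e.g. the header row: nothing to escape
      } else {
        // Drop the outer quotes, double the inner ones, then re-wrap.
        val inner = middle.substring(1, middle.length - 1)
        val escaped = inner.replace("\"", "\"\"")
        head + ",\"" + escaped + "\"," + tail
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val raw =
      "123456789,\"{ \"username\" : \"ghi, jkl\", \"language\" : \"en\" }\",12"
    println(escapeMiddleColumn(raw))
  }
}
```

The function is not idempotent, so it should be applied only once to the raw file. In Spark, one possible way to use it would be to read the file as text with spark.read.textFile, map each line through escapeMiddleColumn, and pass the resulting Dataset[String] to the CSV reader with the same quote and escape options as in the original snippet. That step is untested against this file.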