Hi,

Can you please give us a simple mapping of what the input is and what the output should look like? From your description it is difficult to figure out exactly how you want the records parsed.

Regards,
Gourav Sengupta

On Wed, May 25, 2022 at 9:08 PM Sid <flinkbyhe...@gmail.com> wrote:

> Hi Experts,
>
> I have below CSV data that is getting generated automatically. I can't
> change the data manually.
>
> The data looks like below:
>
> 2020-12-12,abc,2000,,INR,
> 2020-12-09,cde,3000,he is a manager,DOLLARS,nothing
> 2020-12-09,fgh,,software_developer,I only manage the development part.
>
> Since I don't have much experience with the other domains.
>
> It is handled by the other people.,INR
> 2020-12-12,abc,2000,,USD,
>
> The third record is a problem. Since the value is separated by the new
> line by the user while filling up the form. So, how do I handle this?
>
> There are 6 columns and 4 records in total. These are the sample records.
>
> Should I load it as RDD and then may be using a regex should eliminate the
> new lines? Or how it should be? with ". /n" ?
>
> Any suggestions?
>
> Thanks,
> Sid
>
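
For reference, one possible way to rejoin the broken third record is sketched below in plain Python. It is only a sketch under two assumptions that the sample data appears to satisfy but that are not confirmed: no field contains a comma of its own, and the line break never falls inside the last column. Under those assumptions a complete 6-column record has exactly 5 commas, so physical lines can be appended to a buffer until that count is reached. The function name `merge_broken_records` and its parameters are hypothetical, chosen for illustration.

```python
def merge_broken_records(lines, n_columns=6):
    """Rejoin records that were split across several physical lines.

    Assumption: no field contains a comma, so a complete record
    holds at least n_columns - 1 commas.
    """
    records = []
    buffer = ""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines left inside a broken record
        # Join fragments with a single space in place of the line break.
        buffer = f"{buffer} {line}" if buffer else line
        if buffer.count(",") >= n_columns - 1:
            records.append(buffer.split(","))
            buffer = ""
    if buffer:
        raise ValueError(f"incomplete trailing record: {buffer!r}")
    return records


# Sample data copied from the message above.
sample = """2020-12-12,abc,2000,,INR,
2020-12-09,cde,3000,he is a manager,DOLLARS,nothing
2020-12-09,fgh,,software_developer,I only manage the development part.

Since I don't have much experience with the other domains.

It is handled by the other people.,INR
2020-12-12,abc,2000,,USD,
"""

rows = merge_broken_records(sample.splitlines())
for row in rows:
    print(row)
```

If the file could instead be generated with the free-text fields wrapped in quotes, Spark's CSV reader with the `multiLine` option set to true should be able to read the embedded newlines directly. The sample shows no quotes, which is why the sketch merges lines by hand.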