RE: Problem with CSV line break data in PySpark 2.1.0

2017-09-05 Thread JG Perrin
Have you tried the built-in parser, not the databricks one (which is not really used anymore)? What does your original CSV look like? What does your code look like? There are quite a few options to read a CSV…
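For reference, here is a minimal sketch of reading a CSV with quoted multi-line cells through the built-in reader. It rests on some assumptions: the `multiLine` option only exists from Spark 2.2 onward, so this would not run on 2.1.0 as-is; the helper names `write_sample_csv` and `read_multiline_csv`, the sample rows, and the quote/escape settings are made up for illustration and may not match the original file.

```python
import csv


def write_sample_csv(path):
    """Write a small CSV whose F2 cell contains a line break (made-up data)."""
    rows = [
        ["F1", "F2", "F3"],
        ["1", "first line\nsecond line", "x"],
        ["2", "single line", "y"],
    ]
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="") as handle:
        csv.writer(handle, quoting=csv.QUOTE_ALL).writerows(rows)


def read_multiline_csv(spark, path):
    """Read the file with Spark's built-in CSV source.

    `spark` is an existing SparkSession. Assumption: Spark 2.2 or later,
    because the multiLine option is not available in 2.1.0.
    """
    return spark.read.csv(
        path,
        header=True,
        multiLine=True,  # keep quoted line breaks inside one record
        quote='"',
        escape='"',
    )
```

With a session at hand, something like `read_multiline_csv(spark, "/tmp/sample.csv").show(truncate=False)` should display the F2 value of the first record as one cell spanning two lines.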

Re: Problem with CSV line break data in PySpark 2.1.0

2017-09-03 Thread Riccardo Ferrari
Hi Aakash, What I see in the picture seems correct. Spark (pyspark) is reading your F2 cell as a multi-line text. Where are the nulls you're referring to? You might find the pyspark.sql.functions.regexp_replace
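A rough sketch of how `regexp_replace` could be used to flatten line breaks inside a cell, in case that is the goal. The helper names `flatten_line_breaks` and `flatten_text`, the pattern, and the single-space replacement are assumptions for illustration, not something taken from the original code or data.

```python
import re

# One or more consecutive CR/LF characters. The same pattern is valid in
# Python's re module and in the Java regex engine that Spark uses.
LINE_BREAK_PATTERN = r"[\r\n]+"


def flatten_line_breaks(df, column, replacement=" "):
    """Return df with line breaks in `column` replaced by `replacement`."""
    # Imported lazily so the pattern can be checked without a Spark install.
    from pyspark.sql.functions import col, regexp_replace

    return df.withColumn(
        column, regexp_replace(col(column), LINE_BREAK_PATTERN, replacement)
    )


def flatten_text(text, replacement=" "):
    """Plain-Python equivalent, handy for checking the pattern locally."""
    return re.sub(LINE_BREAK_PATTERN, replacement, text)
```

For a DataFrame `df` with a multi-line column named `F2`, `flatten_line_breaks(df, "F2")` would return a new DataFrame in which each F2 value sits on a single line; the original DataFrame is left unchanged.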