Hi Spark Users,

I have a 50 GB JSON file that I would like to read and persist to HDFS so it can feed the next transformation. I am trying to read it with spark.read.json(path), but this throws an out-of-memory error on the driver. Obviously, I can't afford 50 GB of driver memory. In general, what is the best practice for reading a large JSON file like this? A minimal sketch of what I'm currently running is below.
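For reference, this is roughly my current code (the paths are made up, and writing Parquet at the end is just my assumption about how to persist the result; the format isn't fixed yet):

    from pyspark.sql import SparkSession

    # Hypothetical paths -- substitute your own HDFS locations.
    input_path = "hdfs:///data/events.json"      # the ~50 GB JSON file
    output_path = "hdfs:///data/events_parquet"  # where I want to persist it

    spark = SparkSession.builder.appName("read-large-json").getOrCreate()

    # This is the call that OOMs the driver. As I understand it, Spark
    # scans the data up front to infer a schema before building the
    # DataFrame, so I suspect schema inference is part of the problem.
    df = spark.read.json(input_path)

    # Persist to HDFS for the next transformation (Parquet is my
    # assumption, not a requirement).
    df.write.mode("overwrite").parquet(output_path)

One thing I have considered is supplying an explicit schema via .schema(...) to skip inference, and making sure the file is in JSON Lines format (one object per line) rather than a single multi-line document, so Spark can split it across executors. Would that be the right direction?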
Thanks