Hi, I am trying to load a JSON file compressed as .tar.bz2, but Spark throws an error. I am using PySpark with Spark 1.6.2 (Cloudera 5.9).
What is the best way to handle this? I don't want a separate non-Spark job that just decompresses the data. Thanks.
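
For reference, here is a rough sketch of the kind of in-Spark handling I have in mind, in case it helps frame an answer. It assumes each archive fits in one executor's memory and that the files inside hold one UTF-8 JSON object per line. The function name `json_lines_from_tar_bz2` and the HDFS path in the comments are made up for illustration. The idea is to read each archive as raw bytes with `sc.binaryFiles` and unpack it with Python's `tarfile` module inside the job.

```python
import io
import tarfile


def json_lines_from_tar_bz2(path_and_bytes):
    """Yield every non-empty text line from the regular files in a .tar.bz2 archive.

    Expects the (path, content) pair that sc.binaryFiles produces.
    Assumption: archive members contain one JSON object per line, UTF-8 encoded.
    """
    _path, content = path_and_bytes
    # Open the archive from memory; "r:bz2" handles the bzip2 layer and the tar layer.
    with tarfile.open(fileobj=io.BytesIO(content), mode="r:bz2") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other non-file entries
            for raw in archive.extractfile(member):
                line = raw.decode("utf-8").strip()
                if line:
                    yield line


# Hypothetical usage in a PySpark 1.6 shell (the path is a placeholder):
#   lines = sc.binaryFiles("hdfs:///path/to/archives/*.tar.bz2") \
#             .flatMap(json_lines_from_tar_bz2)
#   df = sqlContext.read.json(lines)
```

The drawback is that each archive is unpacked by a single task, so one very large .tar.bz2 would not be read in parallel.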