Hi,
I have a few JSON files in which one of the fields is binary: it holds the
output of running GZIP on a JSON stream, i.e. the compressed bytes of an
inner JSON document.

Now I want to decompress that field and get the inner JSON back out.
I was thinking of running a map operation and passing it a function that
decompresses each file: the function would find the right field in the
outer JSON and then run GUNZIP on it.
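
Roughly what I have in mind is the PySpark sketch below. The field name
"payload", the base64 encoding of the bytes (JSON can't carry raw binary
directly, so I assume the field is encoded somehow), and the input path are
all placeholders on my part:

import base64
import gzip
import json

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("gunzip-json-field").getOrCreate()
sc = spark.sparkContext

def decompress(file_content):
    # Parse the outer JSON document.
    outer = json.loads(file_content)
    # "payload" is a placeholder for the actual binary field name;
    # I'm assuming its bytes are base64-encoded inside the JSON.
    raw = base64.b64decode(outer["payload"])
    # GUNZIP the bytes and parse the inner JSON they contain.
    return json.loads(gzip.decompress(raw).decode("utf-8"))

# wholeTextFiles yields (path, content) pairs, one pair per file,
# so each outer JSON document arrives intact even if it spans lines.
inner_jsons = sc.wholeTextFiles("hdfs:///path/to/files/*.json") \
                .map(lambda kv: decompress(kv[1]))

print(inner_jsons.take(1))

Since the function is a pure per-record transformation with no shared
state, my understanding is it should distribute cleanly across executors.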

1) Is this a valid use of a Spark map operation?
2) Any pointers on how to do it?

Eran