Your vendor should use Parquet's internal compression (a codec such as Snappy or
gzip applied to the column chunks inside the file) rather than gzipping a
finished Parquet file. An externally gzipped file is no longer valid Parquet:
the reader looks for the footer and magic bytes at fixed offsets in the file,
and the gzip wrapper hides them, which is what produces the header error you
are seeing.
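
A minimal sketch of the writer side, assuming Spark 2.x and made-up paths:

    import org.apache.spark.sql.SparkSession

    object WriteParquetCompressed {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("parquet-internal-compression")
          .getOrCreate()

        val df = spark.read.parquet("/data/input") // hypothetical input path

        // Parquet's internal compression: each column chunk is compressed
        // inside the file, so the result is still a valid, splittable
        // Parquet file, unlike gzipping the whole file afterwards.
        df.write
          .option("compression", "gzip") // or "snappy", the Spark 2.x default
          .parquet("/data/output")       // hypothetical output path

        // Reading back needs no special handling: the codec is recorded in
        // the file metadata and decompression happens transparently.
        spark.read.parquet("/data/output").show()

        spark.stop()
      }
    }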

> On 13 Feb 2017, at 18:48, Benjamin Kim <bbuil...@gmail.com> wrote:
> 
> We are receiving files from an outside vendor who creates a Parquet data file 
> and Gzips it before delivery. Does anyone know how to Gunzip the file in 
> Spark and inject the Parquet data into a DataFrame? I thought using 
> sc.textFile or sc.wholeTextFiles would automatically Gunzip the file, but I’m 
> getting a decompression header error when trying to open the Parquet file.
> 
> Thanks,
> Ben
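For the files you already have: sc.textFile does transparently gunzip .gz
input, but it then splits the bytes into text lines, which mangles binary
Parquet data; hence the header error. One workaround is to strip the outer
gzip layer yourself and only then point the Parquet reader at the result. A
rough sketch, assuming the files fit on local disk; the paths are hypothetical:

    import java.io.{BufferedInputStream, FileInputStream, FileOutputStream}
    import java.util.zip.GZIPInputStream

    import org.apache.spark.sql.SparkSession

    object GunzipThenReadParquet {
      // Strip the outer gzip wrapper, recovering the original Parquet bytes.
      def gunzip(src: String, dst: String): Unit = {
        val in  = new GZIPInputStream(new BufferedInputStream(new FileInputStream(src)))
        val out = new FileOutputStream(dst)
        try {
          val buf = new Array[Byte](64 * 1024)
          var n = in.read(buf)
          while (n != -1) { out.write(buf, 0, n); n = in.read(buf) }
        } finally { in.close(); out.close() }
      }

      def main(args: Array[String]): Unit = {
        // Hypothetical local paths; adjust for HDFS/S3 staging as needed.
        gunzip("/tmp/vendor/data.parquet.gz", "/tmp/vendor/data.parquet")

        val spark = SparkSession.builder().appName("read-vendor-parquet").getOrCreate()
        val df = spark.read.parquet("file:///tmp/vendor/data.parquet")
        df.printSchema()
        spark.stop()
      }
    }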
