Hi Harsha,

I use LZOP files extensively on my Spark cluster -- see the writeup in this
mailing list post for how to do it:
http://mail-archives.apache.org/mod_mbox/spark-user/201312.mbox/%3CCAOoZ679ehwvT1g8=qHd2n11Z4EXOBJkP+q=Aj0qE_=shhyl...@mail.gmail.com%3E

Maybe we should better document how to use LZO with Spark because it can be
tricky to get the lzo jars, native libraries, and hadoopFile() calls all
set up correctly.
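
For reference, once the hadoop-lzo jar is on the driver/executor classpath
and the native libraries are on java.library.path, the read itself is just a
newAPIHadoopFile() call. This is a rough sketch from memory -- the HDFS path
is a placeholder, and the exact input format class (LzoTextInputFormat from
hadoop-lzo here) depends on which hadoop-lzo build you use:

    import org.apache.hadoop.io.{LongWritable, Text}
    import com.hadoop.mapreduce.LzoTextInputFormat

    // Each record is (byte offset, line of text); keep just the line.
    val lines = sc.newAPIHadoopFile(
        "hdfs:///path/to/logs/*.lzo",     // placeholder path
        classOf[LzoTextInputFormat],
        classOf[LongWritable],
        classOf[Text]
      ).map(_._2.toString)

Spark decompresses the splits for you, same as MapReduce does, as long as the
codec and natives are visible to the executors.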

Andrew

On Thu, Sep 25, 2014 at 9:44 AM, Harsha HN <99harsha.h....@gmail.com> wrote:

> Hi,
>
> Anybody using LZOP files to process in Spark?
>
> We have a huge volume of LZOP files in HDFS to process through Spark. In
> MapReduce, the framework automatically detects the file format and sends the
> decompressed data to the Mappers.
> Any such support in Spark?
> As of now I am manually downloading and decompressing them before processing.
>
> Thanks,
> Harsha
>
