Re: How to read LZO compressed files?

edward choi Mon, 02 Jan 2012 00:02:56 -0800

Harsh, your comment just saved me several hours of aimless work.
I had added LzoCodec to core-site.xml, but I forgot to add LzopCodec.
Now everything works. Thanks for the reply!


Regards,
Ed
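
For anyone hitting the same problem, here is a minimal sketch of a check that both codecs are listed in io.compression.codecs. It is only an illustration: the helper name missing_lzo_codecs and the SAMPLE fragment are hypothetical, and the two class names are the ones I believe the hadoop-lzo package ships, so verify them against your install.

```python
import xml.etree.ElementTree as ET

# Class names assumed from the hadoop-lzo package; verify against your install.
REQUIRED_CODECS = {
    "com.hadoop.compression.lzo.LzoCodec",
    "com.hadoop.compression.lzo.LzopCodec",
}


def missing_lzo_codecs(core_site_xml):
    """Return the required codec class names absent from io.compression.codecs."""
    root = ET.fromstring(core_site_xml)
    configured = set()
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "io.compression.codecs":
            value = prop.findtext("value", "")
            # The value is a comma-separated list; tolerate whitespace and newlines.
            configured = {c.strip() for c in value.split(",") if c.strip()}
    return REQUIRED_CODECS - configured


# Hypothetical core-site.xml fragment reproducing the mistake described above:
# LzoCodec is listed, LzopCodec is not.
SAMPLE = """<configuration>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec,
           org.apache.hadoop.io.compress.GzipCodec,
           com.hadoop.compression.lzo.LzoCodec</value>
  </property>
</configuration>"""

print(missing_lzo_codecs(SAMPLE))  # {'com.hadoop.compression.lzo.LzopCodec'}
```

An empty set means both codecs are registered. The cluster still has to be restarted after the file is changed, as Harsh notes in the quoted message.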

2012/1/2 Harsh J <[email protected]>

> Hello Edward,
>
> On Mon, Jan 2, 2012 at 11:04 AM, edward choi <[email protected]> wrote:
> > Hi,
> >
> > I'm having trouble handling LZO-compressed files.
> > The input files are compressed with the LzopCodec provided by the
> > hadoop-lzo package.
> > I am using the Cloudera 3 update 2 version of Hadoop.
> >
> > I don't need to split the input file, so there is no need to tell me to
> > index the input file and use LzoTextInputFormat, unless that is the only
> > way to handle LZO-compressed files.
>
> It's OK to use LZO without splitting. There are no issues in doing that.
>
> > I thought all I needed to do was set the job input format to
> > "TextInputFormat" and Hadoop would take care of the rest.
> > When I do that, I don't get any error messages, but the log files tell me
> > that the input files are not decompressed at all. Input files are being
> > handled as raw text files.
>
> By 'Input files are being handled as raw text files.' I assume you
> mean that your mappers are receiving garbage (compressed) input
> that has not been decoded?
>
> Have you ensured that the io.compression.codecs property in your
> core-site.xml lists the canonical class names of both LzoCodec and
> LzopCodec, and that your MR cluster was restarted after this change?
>
> > Is there a specific way to read files with lzo extension?
>
> The above config registers the ".lzo" extension lookup, so LZO files
> are detected automatically and you shouldn't need an explicit way.
>
> --
> Harsh J
>
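
The behaviour in this thread (no error, but compressed bytes reaching the mappers) is consistent with codec selection by file name suffix. Below is a rough sketch of that idea, not Hadoop's actual code: codec_for is a hypothetical helper, and the extension table is an assumption based on my understanding that LzopCodec claims ".lzo" while LzoCodec claims ".lzo_deflate". Check the default extensions of your own codec versions.

```python
# Default extensions as I understand them; treat these as assumptions.
DEFAULT_EXTENSION = {
    "org.apache.hadoop.io.compress.DefaultCodec": ".deflate",
    "org.apache.hadoop.io.compress.GzipCodec": ".gz",
    "com.hadoop.compression.lzo.LzoCodec": ".lzo_deflate",
    "com.hadoop.compression.lzo.LzopCodec": ".lzo",
}


def codec_for(path, registered):
    """Return the first registered codec whose extension ends `path`, else None.

    None models the case where the input is read as plain, undecoded bytes.
    """
    for codec in registered:
        if path.endswith(DEFAULT_EXTENSION[codec]):
            return codec
    return None


# Only LzoCodec registered: nothing claims the ".lzo" suffix.
without_lzop = [
    "org.apache.hadoop.io.compress.DefaultCodec",
    "org.apache.hadoop.io.compress.GzipCodec",
    "com.hadoop.compression.lzo.LzoCodec",
]
print(codec_for("input/part-00000.lzo", without_lzop))  # None

# With LzopCodec added, the same file resolves to a codec.
with_lzop = without_lzop + ["com.hadoop.compression.lzo.LzopCodec"]
print(codec_for("input/part-00000.lzo", with_lzop))  # com.hadoop.compression.lzo.LzopCodec
```

Under these assumptions, registering LzoCodec alone leaves ".lzo" files unmatched, which would explain why adding LzopCodec fixed the job.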
