Hi Ed,

Elephant-bird only works with Pig 0.6 at the moment. There's a branch for 0.7 that I haven't tested: http://github.com/hirohanin/elephant-bird/ Try it and let me know if it works.
-D

On Tue, Sep 21, 2010 at 2:22 PM, pig <hadoopn...@gmail.com> wrote:
> Hello,
>
> I have a small cluster up and running with LZO-compressed files in it. I'm
> using the LZO compression libraries available at
> http://github.com/kevinweil/hadoop-lzo (thank you for maintaining this!)
>
> So far everything works fine when I write regular map-reduce jobs. I can
> read LZO files and write LZO files without any problem.
>
> I'm also using Pig 0.7, and it appears to be able to read LZO files out of
> the box using the default LoadFunc (PigStorage). However, I am currently
> testing a large LZO file (20GB) which I indexed using the LzoIndexer, and
> Pig does not appear to be making use of the indexes. The Pig scripts I've
> run so far only use 3 mappers when processing the 20GB file. My
> understanding was that there should be 1 map for each block (256MB blocks),
> so about 80 mappers when processing the 20GB LZO file. Does Pig 0.7 support
> indexed LZO files with the default load function?
>
> If not, I was looking at elephant-bird and noticed it is only compatible
> with Pig 0.6, not 0.7+. Is that accurate? What would be the recommended
> solution for processing indexed LZO files with Pig 0.7?
>
> Thank you for any assistance!
>
> ~Ed
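For what it's worth, the ~80-mapper figure in the question follows directly from the file and block sizes: with one split per HDFS block, a 20GB file on 256MB blocks should yield about 80 map tasks. A quick sanity check of that arithmetic (plain Python, not tied to any Hadoop API):

```python
# Expected number of map tasks when a splittable (indexed LZO) file
# is divided on HDFS block boundaries: one split per block.
def expected_mappers(file_size_bytes, block_size_bytes):
    """Ceiling division: a partial trailing block still gets its own mapper."""
    return -(-file_size_bytes // block_size_bytes)

GB = 1024 ** 3
MB = 1024 ** 2

# The 20GB file with 256MB blocks described above.
print(expected_mappers(20 * GB, 256 * MB))  # → 80
```

Seeing only 3 mappers on such a file is the usual symptom of the loader treating the .lzo file as unsplittable and ignoring the index, which is exactly what elephant-bird's LZO loaders are meant to fix.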