Well, I thought that would be a simple enough fix, but no luck so far.

I've added the elephant-bird/lib directory (which I made world readable and
executable) to CLASSPATH, JAVA_LIBRARY_PATH, and HADOOP_CLASSPATH, as both
the user running grunt and the hadoop user (sort of a shotgun approach).
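
In case it helps to see exactly what I did, this is roughly the snippet I added to my shell profile. The paths and the native-library directory are from my setup, so treat them as an example rather than a known-good config:

```shell
# Where I unpacked elephant-bird; adjust to your layout (this path is a guess).
EB_LIB=/usr/local/elephant-bird/lib

# Append every jar in elephant-bird's lib/ to both classpaths.
for jar in "$EB_LIB"/*.jar; do
  CLASSPATH="$CLASSPATH:$jar"
  HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$jar"
done
export CLASSPATH HADOOP_CLASSPATH

# JAVA_LIBRARY_PATH is for native libraries (.so files), not jars -- this is
# where I'm guessing libgplcompression.so lives on my boxes.
export JAVA_LIBRARY_PATH="$JAVA_LIBRARY_PATH:/usr/local/hadoop/lib/native/Linux-amd64-64"
```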

I still get the error complaining about "no gplcompression", and in the
log there's an error about not finding com.google.common.collect.Maps.
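
For what it's worth, here's the quick check I've been running to see whether the two pieces are even visible, since one error is about a native library and the other is about a Java class. The search paths are guesses from my setup:

```shell
# The UnsatisfiedLinkError is about a native library on java.library.path;
# the ClassNotFoundException is about a class in a jar on the Java classpath.
# They are loaded by different mechanisms, so I check them separately.

# 1) Look for the native gplcompression library in likely install locations.
find /usr/local/hadoop /usr/lib -name 'libgplcompression*' 2>/dev/null

# 2) Count how many jars on CLASSPATH actually contain Guava's Maps class.
FOUND=0
OLDIFS=$IFS; IFS=':'
for jar in $CLASSPATH; do
  [ -f "$jar" ] || continue
  if unzip -l "$jar" 2>/dev/null | grep -q 'com/google/common/collect/Maps.class'; then
    echo "Maps is in: $jar"
    FOUND=$((FOUND + 1))
  fi
done
IFS=$OLDIFS
echo "jars containing Maps: $FOUND"
```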

Are these two separate problems, or one problem causing two different
errors?  Thank you for the help!

~Ed

On Wed, Sep 22, 2010 at 1:57 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:

> You need the jars in elephant-bird's lib/ on your classpath to run
> Elephant-Bird.
>
>
> On Wed, Sep 22, 2010 at 10:35 AM, pig <hadoopn...@gmail.com> wrote:
>
> > Thank you for pointing out the 0.7 branch.  I'm giving it a shot and have
> > run into a problem when trying to run the following test Pig script:
> >
> > REGISTER elephant-bird-1.0.jar;
> > A = LOAD '/user/foo/input' USING
> > com.twitter.elephantbird.pig.load.LzoTokenizedLoader('\t');
> > B = LIMIT A 100;
> > DUMP B;
> >
> > When I try to run this I get the following error:
> >
> > java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
> > ....
> > ERROR com.hadoop.compression.lzo.LzoCodec - Cannot load native-lzo
> > without native-hadoop
> > ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal
> > error.  could not instantiate
> > 'com.twitter.elephantbird.pig.load.LzoTokenizedLoader' with arguments '[ ]'
> >
> > Looking at the log file it gives the following:
> >
> > java.lang.RuntimeException: could not instantiate
> > 'com.twitter.elephantbird.pig.load.LzoTokenizedLoader' with arguments '[ ]'
> > ...
> > Caused by: java.lang.reflect.InvocationTargetException
> > ...
> > Caused by: java.lang.NoClassDefFoundError: com/google/common/collect/Maps
> > ...
> > Caused by: java.lang.ClassNotFoundException: com.google.common.collect.Maps
> >
> > What is confusing me is that LZO compression and decompression work fine
> > when I'm running a normal Java-based MapReduce program, so I feel as
> > though the libraries must be in the right place with the right settings
> > for java.library.path.  Otherwise, how would normal Java MapReduce work?
> > Is there some other location where I need to set JAVA_LIBRARY_PATH for
> > Pig to pick it up?  My understanding was that it would get this from
> > hadoop-env.sh.  Is the missing com.google.common.collect.Maps class the
> > real problem here?  Thank you for any help!
> >
> > ~Ed
> >
> > On Tue, Sep 21, 2010 at 5:43 PM, Dmitriy Ryaboy <dvrya...@gmail.com>
> > wrote:
> >
> > > Hi Ed,
> > > Elephant-bird only works with Pig 0.6 at the moment.  There's a branch
> > > for 0.7 that I haven't tested: http://github.com/hirohanin/elephant-bird/
> > > Try it, let me know if it works.
> > >
> > > -D
> > >
> > > On Tue, Sep 21, 2010 at 2:22 PM, pig <hadoopn...@gmail.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > I have a small cluster up and running with LZO compressed files in it.
> > > > I'm using the lzo compression libraries available at
> > > > http://github.com/kevinweil/hadoop-lzo (thank you for maintaining this!)
> > > >
> > > > So far everything works fine when I write regular map-reduce jobs.  I
> > > > can read in lzo files and write out lzo files without any problem.
> > > >
> > > > I'm also using Pig 0.7, and it appears to be able to read LZO files
> > > > out of the box using the default LoadFunc (PigStorage).  However, I am
> > > > currently testing a large LZO file (20GB) which I indexed using the
> > > > LzoIndexer, and Pig does not appear to be making use of the index.
> > > > The pig scripts that I've run so far only use 3 mappers when
> > > > processing the 20GB file.  My understanding was that there should be
> > > > one map for each block (256MB blocks), so about 80 mappers when
> > > > processing the 20GB lzo file.  Does Pig 0.7 support indexed lzo files
> > > > with the default load function?
> > > >
> > > > If not, I was looking at elephant-bird and noticed it is only
> > > > compatible with Pig 0.6 and not 0.7+.  Is that accurate?  What would
> > > > be the recommended solution for processing indexed lzo files using
> > > > Pig 0.7?
> > > >
> > > > Thank you for any assistance!
> > > >
> > > > ~Ed
> > > >
> > >
> >
>
