Great.  Thanks!  Will give that a try.
Jessica

On Wed, Oct 5, 2011 at 4:22 PM, Joey Echeverria <[email protected]> wrote:

> It sounds like you're hitting this:
>
> https://issues.apache.org/jira/browse/HIVE-2395
>
> You might need to patch your version of DeprecatedLzoLineRecordReader
> to ignore the .lzo.index files.
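The suggested patch amounts to skipping the `.lzo.index` sidecar files before they ever reach the record reader. The real fix is Java inside the LZO input format, but the effect of the filter can be sketched standalone in shell over example paths:

```shell
# Sketch only: the HIVE-2395 fix is Java in the LZO input format, but
# its effect is equivalent to dropping every path that ends in
# ".lzo.index" before handing files to the record reader.
printf '%s\n' \
  /user/hive/warehouse/foo/bar.lzo \
  /user/hive/warehouse/foo/bar.lzo.index |
  grep -v '\.lzo\.index$'
# -> /user/hive/warehouse/foo/bar.lzo
```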
>
> -Joey
>
> On Wed, Oct 5, 2011 at 4:13 PM, Jessica Owensby
> <[email protected]> wrote:
> > Alex,
> > The task trackers have been restarted many times across the cluster since
> > this issue was first seen.
> >
> > Hmmm, I hadn't tried to explicitly add the lzo jar to my classpath in the
> > hive shell, but I just tried it and got the same errors.
> >
> > Do you see /usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar in the child
> > classpath when the task is executed (use 'ps aux' on the node)?
> >
> >
> > While the job wasn't running, I did this and I got back the tasktracker
> > process:  ps aux | grep java | grep lzo.
> > Do I have to run this while the task is running on that node?
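For what it's worth, the child JVM that runs the map task is a separate process from the tasktracker and only exists while the task runs, so the check has to happen mid-query. A sketch of the check (the jar path is the one from this thread's cluster; the pipeline is demonstrated below on an illustrative `ps` line so it runs anywhere):

```shell
# On a live worker node, run this WHILE the query executes:
#   ps aux | grep '[C]hild' | grep -o 'hadoop-lzo[^:]*\.jar'
# Demonstrated here on an illustrative Child process command line:
sample='java -classpath /usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar:/usr/lib/hadoop-0.20/hadoop-core.jar org.apache.hadoop.mapred.Child'
echo "$sample" | grep -o 'hadoop-lzo[^:]*\.jar'
# -> hadoop-lzo-20110217.jar
```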
> >
> > Joey,
> > Yes, the lzo files are indexed.  They are indexed using the following
> > command:
> >
> > hadoop jar /usr/lib/hadoop/lib/hadoop-lzo-20110217.jar
> > com.hadoop.compression.lzo.LzoIndexer /user/hive/warehouse/foo/bar.lzo
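As a sanity check (a sketch, assuming the hadoop CLI and the paths from this thread), LzoIndexer writes a sidecar named after the data file, so its presence can be tested directly:

```shell
# On the cluster, the sidecar's existence could be checked with:
#   hadoop fs -test -e /user/hive/warehouse/foo/bar.lzo.index && echo indexed
# The sidecar name is just the data file's name plus ".index":
f=/user/hive/warehouse/foo/bar.lzo
echo "${f}.index"
# -> /user/hive/warehouse/foo/bar.lzo.index
```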
> >
> > Jessica
> >
> > On Wed, Oct 5, 2011 at 3:52 PM, Joey Echeverria <[email protected]> wrote:
> >> Are your LZO files indexed?
> >>
> >> -Joey
> >>
> >> On Wed, Oct 5, 2011 at 3:35 PM, Jessica Owensby
> >> <[email protected]> wrote:
> >>> Hi Joey,
> >>> Thanks. I forgot to mention that; yes, the LzoCodec class is listed in
> >>> core-site.xml under the io.compression.codecs property:
> >>>
> >>> <property>
> >>>  <name>io.compression.codecs</name>
> >>>  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
> >>> </property>
> >>>
> >>> I also added the mapred.child.env property to mapred site:
> >>>
> >>>  <property>
> >>>    <name>mapred.child.env</name>
> >>>    <value>JAVA_LIBRARY_PATH=/usr/lib/hadoop-0.20/lib</value>
> >>>  </property>
> >>>
> >>> per these instructions:
> >>>
> >>> http://www.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/
> >>>
> >>> After making each of these changes I have restarted the cluster --
> >>> just to be sure that the new changes were being picked up.
> >>>
> >>> Jessica
> >>>
> >>
> >>
> >>
> >> --
> >> Joseph Echeverria
> >> Cloudera, Inc.
> >> 443.305.9434
> >>
> >
> >
> > Adding back the email history:
> >
> > Hello Everyone,
> > I've been having an issue in a hadoop environment (running cdh3u1)
> > where any table declared in Hive with the directive STORED AS INPUTFORMAT
> > "com.hadoop.mapred.DeprecatedLzoTextInputFormat" produces the
> > following errors when running any query against it.
> >
> > For instance, running "select count(*) from foo;" gives the following error:
> >
> > java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
> >      at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.initNextRecordReader(Hadoop20SShims.java:306)
> >      at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.next(Hadoop20SShims.java:209)
> >      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
> >      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
> >      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
> >      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
> >      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> >      at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >      at java.security.AccessController.doPrivileged(Native Method)
> >      at javax.security.auth.Subject.doAs(Subject.java:396)
> >      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
> >      at org.apache.hadoop.mapred.Child.main(Child.java:264)
> > Caused by: java.lang.reflect.InvocationTargetException
> >      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> >      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> >      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> >      at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> >      at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.initNextRecordReader(Hadoop20SShims.java:292)
> >      ... 11 more
> > Caused by: java.io.IOException: No LZO codec found, cannot run.
> >      at com.hadoop.mapred.DeprecatedLzoLineRecordReader.<init>(DeprecatedLzoLineRecordReader.java:53)
> >      at com.hadoop.mapred.DeprecatedLzoTextInputFormat.getRecordReader(DeprecatedLzoTextInputFormat.java:128)
> >      at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:68)
> >      ... 16 more
> >
> > java.io.IOException: cannot find class com.hadoop.mapred.DeprecatedLzoTextInputFormat
> >      at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:406)
> >      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:371)
> >      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> >      at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >      at java.security.AccessController.doPrivileged(Native Method)
> >      at javax.security.auth.Subject.doAs(Subject.java:396)
> >      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
> >      at org.apache.hadoop.mapred.Child.main(Child.java:264)
> >
> > My thought is that the hadoop-lzo-20110217.jar is not available on the
> > hadoop classpath.  However, the hadoop classpath command shows that
> > /usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar is in the classpath.
> > Additionally, across the cluster on each machine, the
> > hadoop-lzo-20110217.jar is present under /usr/lib/hadoop-0.20/lib/.
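One way to make that check easier to read (a sketch; on a real node the illustrative `echo` below would be replaced by the actual `hadoop classpath` invocation) is to split the colon-separated classpath into one entry per line and grep for the jar:

```shell
# 'hadoop classpath' prints one long colon-separated line; splitting it
# makes a missing jar obvious. Illustrative classpath value below:
cp='/usr/lib/hadoop-0.20/conf:/usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar'
echo "$cp" | tr ':' '\n' | grep lzo
# -> /usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar
```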
> >
> > The hadoop-core-0.20.2-cdh3u1.jar is also available on my hadoop classpath.
> >
> > What else can I investigate to confirm that the lzo jar is on my
> > classpath?  Or is this error indicative of another issue?
> >
> > Jessica
> >
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
