It sounds like you're hitting this:

https://issues.apache.org/jira/browse/HIVE-2395

You might need to patch your version of DeprecatedLzoLineRecordReader
to ignore the .lzo.index files.
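
As a rough illustration of what "ignore the .lzo.index files" means, here
is a minimal sketch in plain Java. It is not the actual HIVE-2395 patch and
it does not modify DeprecatedLzoLineRecordReader. The class name
LzoIndexFilter and both of its methods are hypothetical, and the only
assumption about the data is that the index for bar.lzo sits next to it as
bar.lzo.index. The idea is just to drop index files from a listing before
any of them is handed to a record reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of hadoop-lzo or Hive.
final class LzoIndexFilter {
    // Assumed suffix of the index written next to each .lzo file (bar.lzo -> bar.lzo.index).
    static final String INDEX_SUFFIX = ".lzo.index";

    // True when the path names an LZO index file rather than LZO data.
    static boolean isLzoIndexFile(String path) {
        return path != null && path.endsWith(INDEX_SUFFIX);
    }

    // Returns the input paths with every index file removed, preserving order.
    static List<String> dataFilesOnly(List<String> paths) {
        List<String> result = new ArrayList<String>();
        for (String p : paths) {
            if (!isLzoIndexFile(p)) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> listing = Arrays.asList(
            "/user/hive/warehouse/foo/bar.lzo",
            "/user/hive/warehouse/foo/bar.lzo.index");
        // Only the data file should remain in the printed list.
        System.out.println(dataFilesOnly(listing));
    }
}
```

In a real patch the same check would have to run wherever the input format
or record reader picks up files, so that an index file never reaches the
LZO codec lookup.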

-Joey

On Wed, Oct 5, 2011 at 4:13 PM, Jessica Owensby
<[email protected]> wrote:
> Alex,
> The task trackers have been restarted many times across the cluster since
> this issue was first seen.
>
> Hmmm, I hadn't tried to explicitly add the lzo jar to my classpath in the
> hive shell, but I just tried it and got the same errors.
>
> Do you see /usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar in the child
> classpath when the task is executed (use 'ps aux' on the node)?
>
> While the job wasn't running, I did this and I got back the tasktracker
> process:  ps aux | grep java | grep lzo.
> Do I have to run this while the task is running on that node?
>
> Joey,
> Yes, the lzo files are indexed.  They are indexed using the following
> command:
>
> hadoop jar /usr/lib/hadoop/lib/hadoop-lzo-20110217.jar
> com.hadoop.compression.lzo.LzoIndexer /user/hive/warehouse/foo/bar.lzo
>
> Jessica
>
> On Wed, Oct 5, 2011 at 3:52 PM, Joey Echeverria <[email protected]> wrote:
>> Are your LZO files indexed?
>>
>> -Joey
>>
>> On Wed, Oct 5, 2011 at 3:35 PM, Jessica Owensby
>> <[email protected]> wrote:
>>> Hi Joey,
>>> Thanks. I forgot to say that; yes, the lzocodec class is listed in
>>> core-site.xml under the io.compression.codecs property:
>>>
>>> <property>
>>>  <name>io.compression.codecs</name>
>>>  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
>>> </property>
>>>
>>> I also added the mapred.child.env property to mapred site:
>>>
>>>  <property>
>>>    <name>mapred.child.env</name>
>>>    <value>JAVA_LIBRARY_PATH=/usr/lib/hadoop-0.20/lib</value>
>>>  </property>
>>>
>>> per these instructions:
>>> http://www.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/
>>>
>>> After making each of these changes I have restarted the cluster --
>>> just to be sure that the new changes were being picked up.
>>>
>>> Jessica
>>>
>>
>>
>>
>> --
>> Joseph Echeverria
>> Cloudera, Inc.
>> 443.305.9434
>>
>
>
> Adding back the email history:
>
> Hello Everyone,
> I've been having an issue in a hadoop environment (running cdh3u1)
> where any table declared in hive
> with the "STORED AS INPUTFORMAT
> "com.hadoop.mapred.DeprecatedLzoTextInputFormat"" directive has the
> following errors when running any query against it.
>
> For instance, running "select count(*) from foo;" gives the following error:
>
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
>      at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.initNextRecordReader(Hadoop20SShims.java:306)
>      at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.next(Hadoop20SShims.java:209)
>      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
>      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
>      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
>      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>      at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>      at java.security.AccessController.doPrivileged(Native Method)
>      at javax.security.auth.Subject.doAs(Subject.java:396)
>      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>      at org.apache.hadoop.mapred.Child.main(Child.java:264)
> Caused by: java.lang.reflect.InvocationTargetException
>      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>      at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>      at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.initNextRecordReader(Hadoop20SShims.java:292)
>      ... 11 more
> Caused by: java.io.IOException: No LZO codec found, cannot run.
>      at com.hadoop.mapred.DeprecatedLzoLineRecordReader.<init>(DeprecatedLzoLineRecordReader.java:53)
>      at com.hadoop.mapred.DeprecatedLzoTextInputFormat.getRecordReader(DeprecatedLzoTextInputFormat.java:128)
>      at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:68)
>      ... 16 more
>
> java.io.IOException: cannot find class com.hadoop.mapred.DeprecatedLzoTextInputFormat
>      at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:406)
>      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:371)
>      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>      at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>      at java.security.AccessController.doPrivileged(Native Method)
>      at javax.security.auth.Subject.doAs(Subject.java:396)
>      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>      at org.apache.hadoop.mapred.Child.main(Child.java:264)
>
> My thought is that the hadoop-lzo-20110217.jar is not available on the
> hadoop classpath.  However, the hadoop classpath command shows that
> /usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar is in the classpath.
> Additionally, across the cluster on each machine, the
> hadoop-lzo-20110217.jar is present under /usr/lib/hadoop-0.20/lib/.
>
> The hadoop-core-0.20.2-cdh3u1.jar is also available on my hadoop classpath.
>
> What else can I investigate to confirm that the lzo jar is on my
> classpath?  Or is this error indicative of another issue?
>
> Jessica
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
