Alex,

The task trackers have been restarted many times across the cluster since this issue was first seen.
Hmmm, I hadn't tried to explicitly add the lzo jar to my classpath in the hive shell, but I just tried it and got the same errors.

Do you see /usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar in the child classpath when the task is executed (use 'ps aux' on the node)?

While the job wasn't running, I did this and got back the tasktracker process:

  ps aux | grep java | grep lzo

Do I have to run this while the task is running on that node?

Joey,

Yes, the lzo files are indexed. They were indexed using the following command:

  hadoop jar /usr/lib/hadoop/lib/hadoop-lzo-20110217.jar com.hadoop.compression.lzo.LzoIndexer /user/hive/warehouse/foo/bar.lzo

Jessica

On Wed, Oct 5, 2011 at 3:52 PM, Joey Echeverria <[email protected]> wrote:
> Are your LZO files indexed?
>
> -Joey
>
> On Wed, Oct 5, 2011 at 3:35 PM, Jessica Owensby
> <[email protected]> wrote:
>> Hi Joey,
>> Thanks. I forgot to say that; yes, the LzoCodec class is listed in
>> core-site.xml under the io.compression.codecs property:
>>
>> <property>
>>   <name>io.compression.codecs</name>
>>   <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
>> </property>
>>
>> I also added the mapred.child.env property to mapred-site.xml:
>>
>> <property>
>>   <name>mapred.child.env</name>
>>   <value>JAVA_LIBRARY_PATH=/usr/lib/hadoop-0.20/lib</value>
>> </property>
>>
>> per these instructions:
>> http://www.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/
>>
>> After making each of these changes I restarted the cluster, just to be
>> sure that the new settings were being picked up.
>>
>> Jessica
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434

Adding back the email history:

Hello Everyone,

I've been having an issue in a hadoop environment (running cdh3u1) where any table declared in hive with the STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' directive produces the following errors when running any query against it.
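For reference, such a table is declared roughly along these lines; the column list and the output format shown here are placeholders for illustration (the usual pairing with the deprecated LZO input format), not the actual schema:

  CREATE TABLE foo (col1 STRING)
  STORED AS
    INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';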
For instance, running "select count(*) from foo;" gives the following error:

java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.initNextRecordReader(Hadoop20SShims.java:306)
        at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.next(Hadoop20SShims.java:209)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.initNextRecordReader(Hadoop20SShims.java:292)
        ... 11 more
Caused by: java.io.IOException: No LZO codec found, cannot run.
        at com.hadoop.mapred.DeprecatedLzoLineRecordReader.<init>(DeprecatedLzoLineRecordReader.java:53)
        at com.hadoop.mapred.DeprecatedLzoTextInputFormat.getRecordReader(DeprecatedLzoTextInputFormat.java:128)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:68)
        ... 16 more

java.io.IOException: cannot find class com.hadoop.mapred.DeprecatedLzoTextInputFormat
        at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:406)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:371)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)

My thought is that the hadoop-lzo-20110217.jar is not available on the hadoop classpath. However, the hadoop classpath command shows that /usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar is in the classpath. Additionally, across the cluster on each machine, the hadoop-lzo-20110217.jar is present under /usr/lib/hadoop-0.20/lib/. The hadoop-core-0.20.2-cdh3u1.jar is also available on my hadoop classpath. What else can I investigate to confirm that the lzo jar is on my classpath? Or is this error indicative of another issue?

Jessica
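One concrete form of the child-classpath check discussed above, in case it is useful. This is only a sketch: it assumes Linux task nodes and has to be run on a node while a query against the LZO table is actually executing, since the child JVMs only exist for the lifetime of the task.

  # list the map/reduce child JVMs launched by the tasktracker
  ps aux | grep org.apache.hadoop.mapred.Child | grep -v grep

  # dump the full command line (including -classpath) of one child JVM and
  # look for the lzo jar; replace <PID> with a pid from the line above
  tr '\0' '\n' < /proc/<PID>/cmdline | grep -i lzo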
