Hello Everyone,
I've been having an issue in a hadoop environment (running cdh3u1)
where any table declared in hive
with the "STORED AS INPUTFORMAT
"com.hadoop.mapred.DeprecatedLzoTextInputFormat"" directive has the
following errors when running any query against it.
For instance, running "select count(*) from foo;" gives the following error:
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at
org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.initNextRecordReader(Hadoop20SShims.java:306)
at
org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.next(Hadoop20SShims.java:209)
at
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
at
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at
org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.initNextRecordReader(Hadoop20SShims.java:292)
... 11 more
Caused by: java.io.IOException: No LZO codec found, cannot run.
at
com.hadoop.mapred.DeprecatedLzoLineRecordReader.<init>(DeprecatedLzoLineRecordReader.java:53)
at
com.hadoop.mapred.DeprecatedLzoTextInputFormat.getRecordReader(DeprecatedLzoTextInputFormat.java:128)
at
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:68)
... 16 more
java.io.IOException: cannot find class
com.hadoop.mapred.DeprecatedLzoTextInputFormat
at
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:406)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:371)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
My thought is that the hadoop-lzo-20110217.jar is not available on the
hadoop classpath. However, the hadoop classpath commnd shows that
/usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar is in the classpath.
Additionally, across the cluster on each machine, the
hadoop-lzo-20110217.jar is present under /usr/lib/hadoop-0.20/lib/.
The hadoop-core-0.20.2-cdh3u1.jar is also available on my hadoop classpath.
What else can I investigate to confirm that the lzo jar is on my
classpath? Or is this error indicative of another issue?
Jessica