Great. Thanks! Will give that a try.

Jessica

On Wed, Oct 5, 2011 at 4:22 PM, Joey Echeverria <[email protected]> wrote:
> It sounds like you're hitting this:
>
> https://issues.apache.org/jira/browse/HIVE-2395
>
> You might need to patch your version of DeprecatedLzoLineRecordReader
> to ignore the .lzo.index files.
>
> -Joey
>
> On Wed, Oct 5, 2011 at 4:13 PM, Jessica Owensby
> <[email protected]> wrote:
> > Alex,
> > The task trackers have been restarted many times across the cluster since
> > this issue was first seen.
> >
> > Hmmm, I hadn't tried to explicitly add the lzo jar to my classpath in the
> > hive shell, but I just tried it and got the same errors.
> >
> > > > Do you see /usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar in the
> > > > child classpath when the task is executed (use 'ps aux' on the node)?
> >
> > While the job wasn't running, I did this and I got back the tasktracker
> > process: ps aux | grep java | grep lzo.
> > Do I have to run this while the task is running on that node?
> >
> > Joey,
> > Yes, the lzo files are indexed. They are indexed using the following
> > command:
> >
> > hadoop jar /usr/lib/hadoop/lib/hadoop-lzo-20110217.jar com.hadoop.compression.lzo.LzoIndexer /user/hive/warehouse/foo/bar.lzo
> >
> > Jessica
> >
> > On Wed, Oct 5, 2011 at 3:52 PM, Joey Echeverria <[email protected]> wrote:
> >> Are your LZO files indexed?
> >>
> >> -Joey
> >>
> >> On Wed, Oct 5, 2011 at 3:35 PM, Jessica Owensby
> >> <[email protected]> wrote:
> >>> Hi Joey,
> >>> Thanks.
> >>> I forgot to say that; yes, the lzocodec class is listed in
> >>> core-site.xml under the io.compression.codecs property:
> >>>
> >>> <property>
> >>>   <name>io.compression.codecs</name>
> >>>   <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
> >>> </property>
> >>>
> >>> I also added the mapred.child.env property to mapred site:
> >>>
> >>> <property>
> >>>   <name>mapred.child.env</name>
> >>>   <value>JAVA_LIBRARY_PATH=/usr/lib/hadoop-0.20/lib</value>
> >>> </property>
> >>>
> >>> per these instructions:
> >>>
> >>> http://www.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/
> >>>
> >>> After making each of these changes I have restarted the cluster --
> >>> just to be sure that the new changes were being picked up.
> >>>
> >>> Jessica
> >>
> >> --
> >> Joseph Echeverria
> >> Cloudera, Inc.
> >> 443.305.9434
> >
> > Adding back the email history:
> >
> > Hello Everyone,
> > I've been having an issue in a hadoop environment (running cdh3u1)
> > where any table declared in hive
> > with the "STORED AS INPUTFORMAT
> > "com.hadoop.mapred.DeprecatedLzoTextInputFormat"" directive has the
> > following errors when running any query against it.
> >
> > For instance, running "select count(*) from foo;" gives the following error:
> >
> > java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
> >     at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.initNextRecordReader(Hadoop20SShims.java:306)
> >     at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.next(Hadoop20SShims.java:209)
> >     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
> >     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
> >     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
> >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> >     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:264)
> > Caused by: java.lang.reflect.InvocationTargetException
> >     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> >     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> >     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> >     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> >     at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileRecordReader.initNextRecordReader(Hadoop20SShims.java:292)
> >     ... 11 more
> > Caused by: java.io.IOException: No LZO codec found, cannot run.
> >     at com.hadoop.mapred.DeprecatedLzoLineRecordReader.<init>(DeprecatedLzoLineRecordReader.java:53)
> >     at com.hadoop.mapred.DeprecatedLzoTextInputFormat.getRecordReader(DeprecatedLzoTextInputFormat.java:128)
> >     at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:68)
> >     ... 16 more
> >
> > java.io.IOException: cannot find class com.hadoop.mapred.DeprecatedLzoTextInputFormat
> >     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:406)
> >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:371)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> >     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:264)
> >
> > My thought is that the hadoop-lzo-20110217.jar is not available on the
> > hadoop classpath. However, the hadoop classpath command shows that
> > /usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar is in the classpath.
> > Additionally, across the cluster on each machine, the
> > hadoop-lzo-20110217.jar is present under /usr/lib/hadoop-0.20/lib/.
> >
> > The hadoop-core-0.20.2-cdh3u1.jar is also available on my hadoop classpath.
> >
> > What else can I investigate to confirm that the lzo jar is on my
> > classpath? Or is this error indicative of another issue?
> >
> > Jessica
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
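
On the classpath question raised in the thread (checking the child task's command line with 'ps aux'), here is a minimal sketch of how that check could be scripted. It is only an illustration: it assumes the java command line carries a -classpath or -cp argument, the helper names classpath_entries and has_jar are hypothetical, and the sample command line is made up rather than taken from the cluster discussed above.

```python
import shlex
from pathlib import PurePosixPath


def classpath_entries(command_line):
    """Return the entries of the -classpath/-cp argument of a java command line."""
    args = shlex.split(command_line)
    # Pair each argument with the one that follows it and look for the flag.
    for flag, value in zip(args, args[1:]):
        if flag in ("-classpath", "-cp"):
            return [entry for entry in value.split(":") if entry]
    return []


def has_jar(command_line, jar_prefix="hadoop-lzo"):
    """True if any classpath entry is a jar whose file name starts with jar_prefix."""
    for entry in classpath_entries(command_line):
        name = PurePosixPath(entry).name
        if name.startswith(jar_prefix) and name.endswith(".jar"):
            return True
    return False


# Hypothetical command line, shaped like the COMMAND column of `ps aux`.
sample = (
    "/usr/bin/java -Xmx200m "
    "-classpath /etc/hadoop/conf:/usr/lib/hadoop-0.20/lib/hadoop-lzo-20110217.jar "
    "org.apache.hadoop.mapred.Child"
)
print(has_jar(sample))
```

To be meaningful, the command line fed to has_jar would have to be captured from the child task process while a task is actually running on that node, not from the tasktracker process alone.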
