King,

Here is my raw log of installing Hadoop LZO. This works on 2.2.0 and 2.3.0.
I hope this helps.

./g

Where to get Hadoop LZO
https://github.com/twitter/hadoop-lzo
http://asmarterplanet.com/studentsfor/blog/2013/11/hadoop-cluster-module-lzo-compression.html

Requirements
On CentOS: sudo yum install lzo* --> /usr/lib64/liblzo2.so.2..
On Ubuntu: sudo apt-get install liblzo --> on X86: /usr/lib64/liblzo2.so.2

Clone:
git clone https://github.com/twitter/hadoop-lzo.git

Follow the instructions in README.md from this GitHub site, basically:
cd hadoop-lzo
mvn clean package test

To enable this at run time do:
a. Copy the library to hadoop/share/hadoop/common (if you don't want to modify classpaths by putting the library somewhere else):
   cp lzo././target/hadoop-lzo-0.4.20-SNAPSHOT.jar .. hadoop/share/hadoop/common/
b. Copy /usr/lib64/liblzo2.so.2 to .. Hadoop/lib/native/

From: Gordon Wang [mailto:[email protected]]
Sent: Thursday, March 06, 2014 11:50 PM
To: [email protected]
Subject: Re: MR2 Job over LZO data

You can try to get the source code https://github.com/twitter/hadoop-lzo and then compile it against hadoop 2.2.0.
As far as I remember, as long as you rebuild it, lzo should work with hadoop 2.2.0.

On Thu, Mar 6, 2014 at 6:29 PM, KingDavies <[email protected]> wrote:

Running on Hadoop 2.2.0

The Java MR2 job works as expected on an uncompressed data source using the TextInputFormat.class. But when using the LZO format the job fails:

import com.hadoop.mapreduce.LzoTextInputFormat;
job.setInputFormatClass(LzoTextInputFormat.class);

Dependencies from the maven repository: http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/
Also tried with elephant-bird-core 4.4

The same data can be queried fine from within Hive(0.12) on the same cluster.
The exception:

Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
    at com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at com.cloudreach.DataQuality.Main.main(Main.java:42)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

I believe the issue is related to the changes in Hadoop 2, but where can I find a H2 compatible version?

Thanks

--
Regards
Gordon Wang
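
A note on the error in the original message: in Hadoop 1.x, org.apache.hadoop.mapreduce.JobContext was a class, while in Hadoop 2.x it is an interface, so a hadoop-lzo jar compiled against 1.x fails with IncompatibleClassChangeError on a 2.x cluster. That is why rebuilding against 2.2.0 helps. A rough way to see which variant is on a given classpath is sketched below. The helper name ClassKindCheck and its methods are made up for illustration and are not part of Hadoop or hadoop-lzo; it only uses standard Java reflection.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not part of Hadoop or hadoop-lzo).
final class ClassKindCheck {

    /** Returns "interface", "class", or "missing" for the given class name. */
    static String kindOf(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> c = Class.forName(className, false, ClassKindCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    /** Returns the jar or directory the class was loaded from, or "unknown". */
    static String sourceOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClassKindCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself usually have no code source.
            return (src == null || src.getLocation() == null) ? "unknown" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace above.
        String[] names = {
            "org.apache.hadoop.mapreduce.JobContext",
            "com.hadoop.mapreduce.LzoTextInputFormat"
        };
        for (String name : names) {
            System.out.println(name + " -> " + kindOf(name) + " (from " + sourceOf(name) + ")");
        }
    }
}
```

Run it with the same classpath the job uses. If JobContext is reported as an interface and LzoTextInputFormat comes from a 0.4.19 jar pulled from the maven repository, the jar was most likely built against the older API and needs to be rebuilt as described above.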
