Hi Alex,

There seems to be a problem with your configuration. Can you check which
node attempt_201007202234_0001_m_000000_0 ran on? Go to that machine and
check its config files and its lib subdirectory (for the correct
configuration and for the presence of hadoop-lzo-0.4.4.jar). Then restart
the TaskTracker and, using 'ps -aef | grep -i tasktracker', verify that
hadoop-lzo-0.4.4.jar is on its classpath.
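Something along these lines on that node should do it (a rough sketch
only; it assumes $HADOOP_HOME points at your install and that you use the
stock conf directory, so adjust the paths to your layout):

    # On the node that ran the failed attempt:
    ls -l $HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar    # is the jar there?
    ls $HADOOP_HOME/lib/native/Linux-amd64-64/     # are the native libs there?
    grep -l LzopCodec $HADOOP_HOME/conf/*.xml      # do THIS node's configs register the codec?

    # After restarting the TaskTracker, confirm the jar actually made it
    # onto its classpath (the full java command line shows up in ps):
    ps -aef | grep -i tasktracker | tr ':' '\n' | grep -i hadoop-lzo

If that last grep prints nothing, the TaskTracker JVM never saw the jar,
which would explain the ClassNotFoundException below.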
I have a strong suspicion you have stray config files:
com.hadoop.compression.lzo.LzopCodec is not mentioned in the ones you
provided... (a quick sweep for checking every node's configs is sketched
at the very bottom of this mail)

Alex K

On Wed, Jul 28, 2010 at 7:42 AM, Alex Luya <alexander.l...@gmail.com> wrote:
> Hello:
> I got the source code from http://github.com/kevinweil/hadoop-lzo,
> compiled it successfully, and then:
> 1. copied hadoop-lzo-0.4.4.jar to $HADOOP_HOME/lib on each master and
>    slave,
> 2. copied all files under ../Linux-amd64-64/lib to
>    $HADOOP_HOME/lib/native/Linux-amd64-64 on each master and slave,
> 3. uploaded a file test.lzo to HDFS,
> 4. then ran, to test:
>    hadoop jar $HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar \
>        com.hadoop.compression.lzo.DistributedLzoIndexer test.lzo
>
> and got these errors:
>
> -----------------------------------------------------------------------------
> 10/07/20 22:37:37 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 10/07/20 22:37:37 INFO lzo.LzoCodec: Successfully loaded & initialized
> native-lzo library [hadoop-lzo rev 5c25e0073d3dae9ace4bd9eba72e4dc43650c646]
> ##########^_^^_^^_^^_^^_^^_^##################
> (I think this says the native library was loaded successfully)
> ################################
> 10/07/20 22:37:37 INFO lzo.DistributedLzoIndexer: Adding LZO file
> target.lzo to indexing list (no index currently exists)
> ...
> attempt_201007202234_0001_m_000000_0, Status : FAILED
> java.lang.IllegalArgumentException: Compression codec
> com.hadoop.compression.lzo.LzopCodec not found.
>         at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:96)
>         at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:134)
>         at com.hadoop.mapreduce.LzoSplitRecordReader.initialize(LzoSplitRecordReader.java:48)
>         at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:418)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:620)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.lang.ClassNotFoundException:
> com.hadoop.compression.lzo.LzopCodec
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:247)
>         at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
>         at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:89)
>         ... 6 more
> -----------------------------------------------------------------------------
>
> There are installation instructions at
> http://github.com/kevinweil/hadoop-lzo; they say further configuration is
> needed:
>
> Once the libs are built and installed, you may want to add them to the
> class paths and library paths.
> That is, in hadoop-env.sh, set
>
> (1) export HADOOP_CLASSPATH=/path/to/your/hadoop-lzo-lib.jar
>
> Question: I have already copied hadoop-lzo-0.4.4.jar to $HADOOP_HOME/lib;
> should I still set this entry? Actually, after I added
>
> export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HBASE_HOME/hbase-0.20.4.jar:$HABSE_HOME/config:$ZOOKEEPER_HOME/zookeeper-3.3.1.jar:$HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar
>
> and redid steps 1-4 above, I got the same problem as before. So: how can
> I get hadoop to load hadoop-lzo-0.4.4.jar?
>
> (2) export JAVA_LIBRARY_PATH=/path/to/hadoop-lzo-native-libs:/path/to/standard-hadoop-native-libs
>
> Note that there seems to be a bug in /path/to/hadoop/bin/hadoop; comment
> out the line
>
> (3) JAVA_LIBRARY_PATH=''
>
> Question: since the native library was loaded successfully, are
> operations (2) and (3) even needed?
>
> -----------------------------------------------
> I am using hadoop 0.20.2.
>
> core-site.xml
> -----------------------------------------------------------------------------
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://hadoop:8020</value>
>   </property>
>   <property>
>     <name>hadoop.tmp.dir</name>
>     <value>/home/hadoop/tmp</value>
>   </property>
>   <property>
>     <name>io.compression.codecs</name>
>     <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
>   </property>
>   <property>
>     <name>io.compression.codec.lzo.class</name>
>     <value>com.hadoop.compression.lzo.LzoCodec</value>
>   </property>
> </configuration>
>
> -----------------------------------------------------------------------------
> mapred-site.xml
> -----------------------------------------------------------------------------
> <configuration>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>AlexLuya:9001</value>
>   </property>
>   <property>
>     <name>mapred.tasktracker.reduce.tasks.maximum</name>
>     <value>1</value>
>   </property>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>1</value>
>   </property>
>   <property>
>     <name>mapred.local.dir</name>
>     <value>/home/alex/hadoop/mapred/local</value>
>   </property>
>   <property>
>     <name>mapred.system.dir</name>
>     <value>/tmp/hadoop/mapred/system</value>
>   </property>
>   <property>
>     <name>mapreduce.map.output.compress</name>
>     <value>true</value>
>   </property>
>   <property>
>     <name>mapreduce.map.output.compress.codec</name>
>     <value>com.hadoop.compression.lzo.LzoCodec</value>
>   </property>
> </configuration>
>
> -----------------------------------------------------------------------------
> hadoop-env.sh
> -----------------------------------------------------------------------------
> # Set Hadoop-specific environment variables here.
>
> # The only required environment variable is JAVA_HOME. All others are
> # optional. When running a distributed configuration it is best to
> # set JAVA_HOME in this file, so that it is correctly defined on
> # remote nodes.
>
> # The java implementation to use. Required.
> export JAVA_HOME=/usr/local/hadoop/jdk1.6.0_20
>
> # Extra Java CLASSPATH elements. Optional.
> # export HADOOP_CLASSPATH=
>
> # The maximum amount of heap to use, in MB. Default is 1000.
> export HADOOP_HEAPSIZE=200
>
> # Extra Java runtime options. Empty by default.
> #export HADOOP_OPTS=-server
>
> export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HBASE_HOME/hbase-0.20.4.jar:$HABSE_HOME/config:$ZOOKEEPER_HOME/zookeeper-3.3.1.jar:$HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar
> #export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native/Linux-amd64-64
>
> # Command specific options appended to HADOOP_OPTS when specified
> export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
> export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
> export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
> export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
> export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
> # export HADOOP_TASKTRACKER_OPTS=
> # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
> # export HADOOP_CLIENT_OPTS
>
> # Extra ssh options. Empty by default.
> # export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"
>
> # Where log files are stored. $HADOOP_HOME/logs by default.
> # export HADOOP_LOG_DIR=${HADOOP_HOME}/logs
>
> # File naming remote slave hosts. $HADOOP_HOME/conf/slaves by default.
> # export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves
>
> # host:path where hadoop code should be rsync'd from. Unset by default.
> # export HADOOP_MASTER=master:/home/$USER/src/hadoop
>
> # Seconds to sleep between slave commands. Unset by default. This
> # can be useful in large clusters, where, e.g., slave rsyncs can
> # otherwise arrive faster than the master can service them.
> # export HADOOP_SLAVE_SLEEP=0.1
>
> # The directory where pid files are stored. /tmp by default.
> # export HADOOP_PID_DIR=/var/hadoop/pids
>
> # A string representing this instance of hadoop. $USER by default.
> #export HADOOP_IDENT_STRING=$USER
>
> # The scheduling priority for daemon processes. See 'man nice'.
> # export HADOOP_NICENESS=10
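P.S. To rule the stray config files in or out cluster-wide, a sweep along
these lines can help (a sketch only, not tested against your setup; it
assumes passwordless ssh to the slaves, that $HADOOP_HOME is the same path
on every node, and that conf/slaves lists all of them):

    # Hypothetical sweep: check each slave for the jar and for LzopCodec
    # in the core-site.xml its daemons actually read.
    for host in $(cat $HADOOP_HOME/conf/slaves); do
        echo "== $host =="
        # jar present on this node?
        ssh $host "ls $HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar 2>/dev/null || echo MISSING-jar"
        # does this node's core-site.xml register the codec?
        ssh $host "grep -q LzopCodec $HADOOP_HOME/conf/core-site.xml || echo MISSING-LzopCodec"
    done

Any node that prints MISSING-jar or MISSING-LzopCodec is the likely home
of the failed attempt.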