If the jar is included in HADOOP_CLASSPATH in your config but you are still
getting ClassNotFoundException, try printing out the map task environment.
An easy way to do this is to run a Hadoop Streaming job with the mapper set
to printenv (or something similar), so the task's environment is written to
the output path. If that environment does not match your config, it usually
means the TaskTrackers were not restarted after the config was edited.
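A rough sketch of such a diagnostic job (the streaming jar path and the dummy input path are assumptions; adjust to your install, and note this needs a running cluster):

```shell
# Dump each map task's environment via a streaming job whose mapper is printenv.
$HADOOP_HOME/bin/hadoop jar \
    $HADOOP_HOME/contrib/streaming/hadoop-0.20.2-streaming.jar \
    -input  /tmp/dummy-input \
    -output /tmp/env-dump \
    -mapper /usr/bin/printenv \
    -numReduceTasks 0

# Then inspect what the tasks actually saw, e.g. the classpath:
$HADOOP_HOME/bin/hadoop fs -cat '/tmp/env-dump/part-*' | grep -i classpath
```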
> Date: Sun, 21 Nov 2010 21:48:14 +0530
> Subject: Re: ClassNotFoundException while running some HBase m/r jobs
> From: [email protected]
> To: [email protected]
>
> Hi Ted,
> I tried doing the same thing with ant. And it worked! Thanks guys!
> But what I have now is a 26 MB fat jar file, since I included all the jars
> from the classpathref (shown above in the ant script). Is there any
> other way I can get it to work? What is the root cause of this problem? This
> solution works but looks very unclean. Ideally, the jar files should get
> found, right? Or is this meant to be used like this by design?
>
> thanks,
> hari
>
> On Sun, Nov 21, 2010 at 8:55 PM, Ted Yu <[email protected]> wrote:
>
> > We package hbase class files (using maven) into our jar. E.g.
> > [had...@us01-ciqps1-name01 sims]$ jar tvf lib/flow-3.0.0.54-294813.jar | grep hbase
> > ...
> >   6743 Sat Jul 03 09:17:38 GMT 2010 org/apache/hadoop/hbase/thrift/ThriftUtilities.class
> >  24472 Sat Jul 03 09:17:38 GMT 2010 org/apache/hadoop/hbase/thrift/ThriftServer$HBaseHandler.class
> >   3897 Sat Jul 03 09:17:38 GMT 2010 org/apache/hadoop/hbase/thrift/ThriftServer.class
> >    565 Sat Jul 03 07:16:26 GMT 2010 org/apache/hadoop/hbase/TableNotFoundException.class
> >   2306 Sat Jul 03 07:16:26 GMT 2010 org/apache/hadoop/hbase/HStoreKey$StoreKeyComparator.class
> >    722 Sat Jul 03 07:16:22 GMT 2010 org/apache/hadoop/hbase/DoNotRetryIOException.class
> >
> > FYI
> >
> > On Sun, Nov 21, 2010 at 7:18 AM, Hari Sreekumar <[email protected]> wrote:
> >
> > > Hi Ted,
> > >
> > > Sure.. I use this command:
> > > $HADOOP_HOME/bin/hadoop jar ~/MRJobs.jar BulkUpload /tmp/customerData.dat
> > >
> > > /tmp/customerData.dat is the argument (text file from which data is to be
> > > uploaded) and BulkUpload is the class name.
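As an aside on the launch command above: if the BulkUpload driver goes through ToolRunner/GenericOptionsParser, the missing jar can also be shipped per job instead of being baked into the config. A hedged sketch (it assumes the driver parses generic options, and reuses paths from this thread; it needs a real cluster):

```shell
# Put the HBase jar on the client classpath AND ship it with the job.
# -libjars is only honored when the driver uses GenericOptionsParser.
HADOOP_CLASSPATH=/opt/hbase/hbase-0.20.6.jar \
  $HADOOP_HOME/bin/hadoop jar ~/MRJobs.jar BulkUpload \
  -libjars /opt/hbase/hbase-0.20.6.jar /tmp/customerData.dat
```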
> > >
> > > thanks,
> > > hari
> > >
> > > On Sun, Nov 21, 2010 at 8:35 PM, Ted Yu <[email protected]> wrote:
> > >
> > > > Can you show us the command which you use to launch the M/R job ?
> > > >
> > > > Thanks
> > > >
> > > > On Sun, Nov 21, 2010 at 5:26 AM, Hari Sreekumar <[email protected]> wrote:
> > > >
> > > > > Hey Lars,
> > > > > You mean copying all required jar files to the lib/ folder in each jar? Is
> > > > > it worth the redundancy? I'll check if it works if I do that. Currently, I
> > > > > am using ant to build my jar file with these instructions to include files:
> > > > >
> > > > > <path id="classpath">
> > > > >   <fileset dir="${lib.dir}" includes="**/*.jar"/>
> > > > >   <fileset dir="${env.HADOOP_HOME}" includes="*.jar"/>
> > > > >   <fileset dir="${env.HBASE_HOME}" includes="*.jar"/>
> > > > >   <fileset dir="${env.HADOOP_HOME}/lib" includes="**/*.jar"/>
> > > > >   <fileset dir="${env.HBASE_HOME}/lib" includes="**/*.jar"/>
> > > > > </path>
> > > > >
> > > > > <target name="compile" depends="clean">
> > > > >   <mkdir dir="${build.dir}"/>
> > > > >   <javac srcdir="${src.dir}" destdir="${build.dir}" classpathref="classpath"/>
> > > > >   <copy todir="${build.dir}">
> > > > >     <fileset dir="${input.dir}" includes="*.*"/>
> > > > >   </copy>
> > > > > </target>
> > > > >
> > > > > I'll try copying all jars into the lib and including only the lib folder now.
> > > > >
> > > > > thanks,
> > > > > hari
> > > > >
> > > > > On Sun, Nov 21, 2010 at 5:32 PM, Lars George <[email protected]> wrote:
> > > > >
> > > > > > Hi Hari,
> > > > > >
> > > > > > I would try the "fat" jar approach. It is much easier to maintain, as each
> > > > > > job jar contains its required dependencies. Adding jars to the nodes and
> > > > > > the config becomes a maintenance nightmare very quickly. I am personally
> > > > > > using Maven to build my job jars, with the Maven "Package" plugin and a
> > > > > > custom package descriptor which - upon building the project - wraps
> > > > > > everything up for me in one fell swoop.
> > > > > >
> > > > > > Lars
> > > > > >
> > > > > > On Sun, Nov 21, 2010 at 8:17 AM, Hari Sreekumar <[email protected]> wrote:
> > > > > > > Hi Lars,
> > > > > > >
> > > > > > > I tried copying the conf to all nodes and copying the jar, and it is
> > > > > > > still giving the same error. The weird thing is that tasks on the master
> > > > > > > node are also failing with the same error, even though all my files are
> > > > > > > available on the master. I am sure I'm missing something basic here, but
> > > > > > > I am unable to pinpoint the exact problem.
> > > > > > >
> > > > > > > hari
> > > > > > >
> > > > > > > On Sun, Nov 21, 2010 at 3:11 AM, Lars George <[email protected]> wrote:
> > > > > > >
> > > > > > >> Hi Hari,
> > > > > > >>
> > > > > > >> This is most certainly a classpath issue. You either have to add the jar
> > > > > > >> to all TaskTracker servers and add it to the HADOOP_CLASSPATH line in
> > > > > > >> hadoop-env.sh (and copy that to all servers again *and* restart the
> > > > > > >> TaskTracker process!), or put the jar into a /lib directory inside the
> > > > > > >> job jar.
> > > > > > >>
> > > > > > >> Lars
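The first option Lars describes would look roughly like the fragment below. This is a sketch, not a definitive config: the jar paths are taken from the classpath output quoted later in this thread, and must match your own layout on every node.

```shell
# In $HADOOP_HOME/conf/hadoop-env.sh on *every* TaskTracker node,
# then restart the TaskTracker processes:
export HADOOP_CLASSPATH=/opt/hbase/hbase-0.20.6.jar:/opt/hbase/conf:/opt/hbase/lib/zookeeper-3.2.2.jar:$HADOOP_CLASSPATH
```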
> > > > > > >>
> > > > > > >> On Nov 20, 2010, at 22:33, Hari Sreekumar <[email protected]> wrote:
> > > > > > >>
> > > > > > >> > Hi,
> > > > > > >> >
> > > > > > >> > I am getting this exception while running m/r jobs on HBase:
> > > > > > >> >
> > > > > > >> > 10/11/21 02:53:01 INFO input.FileInputFormat: Total input paths to process : 1
> > > > > > >> > 10/11/21 02:53:01 INFO mapred.JobClient: Running job: job_201011210240_0002
> > > > > > >> > 10/11/21 02:53:02 INFO mapred.JobClient: map 0% reduce 0%
> > > > > > >> > 10/11/21 02:53:08 INFO mapred.JobClient: Task Id : attempt_201011210240_0002_m_000036_0, Status : FAILED
> > > > > > >> > java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableOutputFormat
> > > > > > >> >         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
> > > > > > >> >         at org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:193)
> > > > > > >> >         at org.apache.hadoop.mapred.Task.initialize(Task.java:413)
> > > > > > >> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:288)
> > > > > > >> >         at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > >> > Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableOutputFormat
> > > > > > >> >         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> > > > > > >> >         at java.security.AccessController.doPrivileged(Native Method)
> > > > > > >> >         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> > > > > > >> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> > > > > > >> >         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> > > > > > >> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> > > > > > >> >         at java.lang.Class.forName0(Native Method)
> > > > > > >> >         at java.lang.Class.forName(Class.java:247)
> > > > > > >> >         at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
> > > > > > >> >         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
> > > > > > >> >         ... 4 more
> > > > > > >> >
> > > > > > >> > What could be the probable reasons for this? I have made sure that
> > > > > > >> > hbase-0.20.6.jar, which contains this particular class, is included in
> > > > > > >> > the class path. In fact, if I run non-m/r jobs, it works fine. E.g., I
> > > > > > >> > ran a jar file successfully that uses HAdmin to create some tables.
> > > > > > >> > Here is a part of the output from these jobs:
> > > > > > >> >
> > > > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
> > > > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/java/jdk1.6.0_22/jre
> > > > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/opt/hadoop/bin/../conf:/usr/java/jdk1.6.0_22/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/opt/hbase/hbase-0.20.6.jar:/opt/hbase/hbase-0.20.6-test.jar:/opt/hbase/conf:/opt/hbase/lib/zookeeper-3.2.2.jar
> > > > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64
> > > > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
> > > > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
> > > > > > >> >
> > > > > > >> > As you can see, /opt/hbase/hbase-0.20.6.jar is included in the
> > > > > > >> > classpath. What else could it be?
> > > > > > >> >
> > > > > > >> > thanks,
> > > > > > >> > hari
> > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >