We package hbase class files (using maven) into our jar. E.g.

[had...@us01-ciqps1-name01 sims]$ jar tvf lib/flow-3.0.0.54-294813.jar | grep hbase
...
  6743 Sat Jul 03 09:17:38 GMT 2010 org/apache/hadoop/hbase/thrift/ThriftUtilities.class
 24472 Sat Jul 03 09:17:38 GMT 2010 org/apache/hadoop/hbase/thrift/ThriftServer$HBaseHandler.class
  3897 Sat Jul 03 09:17:38 GMT 2010 org/apache/hadoop/hbase/thrift/ThriftServer.class
   565 Sat Jul 03 07:16:26 GMT 2010 org/apache/hadoop/hbase/TableNotFoundException.class
  2306 Sat Jul 03 07:16:26 GMT 2010 org/apache/hadoop/hbase/HStoreKey$StoreKeyComparator.class
   722 Sat Jul 03 07:16:22 GMT 2010 org/apache/hadoop/hbase/DoNotRetryIOException.class
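
The same kind of check as the `jar tvf ... | grep hbase` listing can also be done from Java. This is only a minimal sketch, assuming you want to know whether one given class sits at the top level of a jar. The class name JarClassCheck and its two methods are hypothetical helpers made up for illustration; they are not part of Hadoop or HBase. It uses only java.util.jar.JarFile from the JDK and does not look inside jars nested under lib/.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Hadoop or HBase.
public class JarClassCheck {

    /** Converts a fully qualified class name into its entry path inside a jar. */
    public static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /**
     * Returns true if the jar has a top-level entry for the given class.
     * Jars nested inside the jar (e.g. under lib/) are NOT searched.
     */
    public static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarClassCheck <jar> <fully.qualified.ClassName>");
            System.exit(2);
        }
        boolean found = containsClass(args[0], args[1]);
        System.out.println(args[1] + (found ? " found in " : " NOT found in ") + args[0]);
    }
}
```

For example, running it with the job jar and org.apache.hadoop.hbase.mapreduce.TableOutputFormat as arguments would report whether that class was packaged directly into the jar.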
FYI

On Sun, Nov 21, 2010 at 7:18 AM, Hari Sreekumar <[email protected]> wrote:

> Hi Ted,
>
> Sure.. I use this command:
> $HADOOP_HOME/bin/hadoop jar ~/MRJobs.jar BulkUpload /tmp/customerData.dat
>
> /tmp/customerData.dat is the argument (text file from which data is to be
> uploaded) and BulkUpload is the class name.
>
> thanks,
> hari
>
> On Sun, Nov 21, 2010 at 8:35 PM, Ted Yu <[email protected]> wrote:
>
> > Can you show us the command which you use to launch the M/R job ?
> >
> > Thanks
> >
> > On Sun, Nov 21, 2010 at 5:26 AM, Hari Sreekumar <[email protected]> wrote:
> >
> > > Hey Lars,
> > > You mean copying all required jar files to the lib/ folder in each jar?
> > > Is it worth the redundancy? I'll check if it works if I do that. Currently,
> > > I am using ant to build my jar file with these instructions to include
> > > files:
> > >
> > > <path id="classpath">
> > >   <fileset dir="${lib.dir}" includes="**/*.jar"/>
> > >   <fileset dir="${env.HADOOP_HOME}" includes="*.jar"/>
> > >   <fileset dir="${env.HBASE_HOME}" includes="*.jar"/>
> > >   <fileset dir="${env.HADOOP_HOME}/lib" includes="**/*.jar"/>
> > >   <fileset dir="${env.HBASE_HOME}/lib" includes="**/*.jar"/>
> > > </path>
> > >
> > > <target name="compile" depends="clean">
> > >   <mkdir dir="${build.dir}"/>
> > >   <javac srcdir="${src.dir}" destdir="${build.dir}" classpathref="classpath"/>
> > >   <copy todir="${build.dir}">
> > >     <fileset dir="${input.dir}" includes="*.*"/>
> > >   </copy>
> > > </target>
> > >
> > > I'll try copying all jars into the lib and including only the lib folder
> > > now.
> > >
> > > thanks,
> > > hari
> > >
> > > On Sun, Nov 21, 2010 at 5:32 PM, Lars George <[email protected]> wrote:
> > >
> > > > Hi Hari,
> > > >
> > > > I would try the "fat" jar approach. It is much easier to maintain as
> > > > each job jar contains its required dependencies. Adding it to the
> > > > nodes and config is becoming a maintenance nightmare very quickly.
> > > > I am personally using Maven to build my job jars and the Maven "Package"
> > > > plugin that has a custom package descriptor which - upon building the
> > > > project - wraps everything up for me in one fell swoop.
> > > >
> > > > Lars
> > > >
> > > > On Sun, Nov 21, 2010 at 8:17 AM, Hari Sreekumar <[email protected]> wrote:
> > > >
> > > > > Hi Lars,
> > > > >
> > > > > I tried copying conf to all nodes and copying jar, it is still
> > > > > giving the same error. Weird thing is that tasks on the master node are also
> > > > > failing with the same error, even though all my files are available on
> > > > > master. I am sure I'm missing something basic here, but unable to pinpoint
> > > > > the exact problem.
> > > > >
> > > > > hari
> > > > >
> > > > > On Sun, Nov 21, 2010 at 3:11 AM, Lars George <[email protected]> wrote:
> > > > >
> > > > > > Hi Hari,
> > > > > >
> > > > > > This is most certainly a classpath issue. You either have to add the jar to
> > > > > > all TaskTracker servers and add it into the hadoop-env.sh in the
> > > > > > HADOOP_CLASSPATH line (and copy it to all servers again *and* restart the
> > > > > > TaskTracker process!) or put the jar into the job jar into a /lib directory.
> > > > > >
> > > > > > Lars
> > > > > >
> > > > > > On Nov 20, 2010, at 22:33, Hari Sreekumar <[email protected]> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I am getting this exception while running m/r jobs on HBase:
> > > > > > >
> > > > > > > 10/11/21 02:53:01 INFO input.FileInputFormat: Total input paths to process : 1
> > > > > > > 10/11/21 02:53:01 INFO mapred.JobClient: Running job: job_201011210240_0002
> > > > > > > 10/11/21 02:53:02 INFO mapred.JobClient:  map 0% reduce 0%
> > > > > > > 10/11/21 02:53:08 INFO mapred.JobClient: Task Id : attempt_201011210240_0002_m_000036_0, Status : FAILED
> > > > > > > java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableOutputFormat
> > > > > > >     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
> > > > > > >     at org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:193)
> > > > > > >     at org.apache.hadoop.mapred.Task.initialize(Task.java:413)
> > > > > > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:288)
> > > > > > >     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableOutputFormat
> > > > > > >     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> > > > > > >     at java.security.AccessController.doPrivileged(Native Method)
> > > > > > >     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> > > > > > >     at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> > > > > > >     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> > > > > > >     at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> > > > > > >     at java.lang.Class.forName0(Native Method)
> > > > > > >     at java.lang.Class.forName(Class.java:247)
> > > > > > >     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
> > > > > > >     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
> > > > > > >     ... 4 more
> > > > > > >
> > > > > > > What could be the probable reasons for this? I have made sure that
> > > > > > > hbase-0.20.6.jar, which contains this particular class, is included in the
> > > > > > > class path. In fact, if I run non-m/r jobs, it works fine. e.g, I ran a jar
> > > > > > > file successfully that uses HAdmin to create some tables. Here is a part of
> > > > > > > the output from these jobs:
> > > > > > >
> > > > > > > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
> > > > > > > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/java/jdk1.6.0_22/jre
> > > > > > > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/opt/hadoop/bin/../conf:/usr/java/jdk1.6.0_22/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/opt/hbase/hbase-0.20.6.jar:/opt/hbase/hbase-0.20.6-test.jar:/opt/hbase/conf:/opt/hbase/lib/zookeeper-3.2.2.jar
> > > > > > > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64
> > > > > > > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
> > > > > > > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
> > > > > > >
> > > > > > > As you can see, /opt/hbase/hbase-0.20.6.jar is included in the classpath.
> > > > > > > What else could be it?
> > > > > > >
> > > > > > > thanks,
> > > > > > > hari
