I find it is often faster to skip the reduce phase when updating rows in HBase (a trick I picked up from Ryan). Essentially, you read a row from HBase, do your processing, and write the row back to HBase. The only time you want a reduce phase is when you need some aggregation, or when there is output you want to skip (e.g., you have a Zipfian distribution and want to ignore the low-count occurrences).
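A minimal sketch of that map-only pattern, with hypothetical names (a table "myTable" with a column family "cf"): the mapper emits a Put keyed on the same row, TableOutputFormat applies it to the table, and setNumReduceTasks(0) drops the shuffle and reduce entirely.

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;

public class MapOnlyUpdate {

    // Reads each row, does the per-row processing, and writes a Put straight
    // back to the table: no shuffle and no reduce phase.
    static class UpdateMapper extends TableMapper<ImmutableBytesWritable, Put> {
        @Override
        public void map(ImmutableBytesWritable row, Result values, Context context)
                throws IOException, InterruptedException {
            Put put = new Put(row.get());
            // ... per-row processing goes here; this just flags the row ...
            put.add(Bytes.toBytes("cf"), Bytes.toBytes("processed"), Bytes.toBytes(true));
            context.write(row, put);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(HBaseConfiguration.create(), "MapOnlyUpdate");
        job.setJarByClass(MapOnlyUpdate.class);
        TableMapReduceUtil.initTableMapperJob("myTable", new Scan(),
                UpdateMapper.class, ImmutableBytesWritable.class, Put.class, job);
        // A null reducer class still wires up TableOutputFormat for the job
        // without registering a reducer; then drop reduce tasks entirely.
        TableMapReduceUtil.initTableReducerJob("myTable", null, job);
        job.setNumReduceTasks(0);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}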
Dave

-----Original Message-----
From: Taylor, Ronald C [mailto:[email protected]]
Sent: Friday, September 17, 2010 4:19 PM
To: '[email protected]'
Cc: Taylor, Ronald C
Subject: hadoop-hbase failure - could use some help, a class is apparently not being found by Hadoop

Hi folks,

Got a problem in basic Hadoop-HBase communication. My small test program ProteinCounter1.java, shown in full below, reports this error:

java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)

The full invocation and error msgs are shown at the bottom. We are using Hadoop 0.20.2 with HBase 0.89.20100726 on a 24-node cluster.

Hadoop and HBase each appear to work fine separately. That is, I've created programs that run MapReduce on files, and programs that import data into HBase tables and manipulate them, and both types of programs have gone quite smoothly. Now I want to combine the two: use MapReduce on data drawn from an HBase table, with results placed back into an HBase table. But my test program for that, as you can see from the error msg, is not working.

Apparently the org.apache.hadoop.hbase.mapreduce.TableOutputFormat class is not found. However, I have added these paths, including the relevant HBase *.jar, to HADOOP_CLASSPATH, so the missing class should have been found:

export HADOOP_CLASSPATH=/home/hbase/hbase/conf:/home/hbase/hbase/hbase-0.89.20100726.jar:/home/rtaylor/HadoopWork/log4j-1.2.16.jar:/home/rtaylor/HadoopWork/zookeeper-3.3.1.jar

This change was made in the ../hadoop/conf/hadoop-env.sh file. I checked the manifest of /home/hbase/hbase/hbase-0.89.20100726.jar, and org/apache/hadoop/hbase/mapreduce/TableOutputFormat.class is indeed present in that HBase *.jar file. Also, I have restarted both HBase and Hadoop after making this change.
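(Aside: HADOOP_CLASSPATH set in hadoop-env.sh extends the classpath of the JVMs that source that file; whether it reaches the map and reduce task JVMs depends on each worker's TaskTracker environment, so a class visible at job-submission time is not necessarily visible inside a task. Below is a hedged sketch of one alternative, shipping the HBase jar with the job itself via the distributed cache. It assumes TableMapReduceUtil.addDependencyJars, a helper present in later HBase releases that may not exist in 0.89.)

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class ShipHBaseJars {
    public static void main(String[] args) throws Exception {
        Job job = new Job(HBaseConfiguration.create(), "HBaseTest_Using_ProteinCounter");
        job.setJarByClass(ShipHBaseJars.class);
        // ... initTableMapperJob / initTableReducerJob as in the program below ...
        // Put the jars containing the job's HBase dependencies into the
        // distributed cache so the task JVMs can load TableOutputFormat too.
        TableMapReduceUtil.addDependencyJars(job);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

(Bundling the dependency jars under a lib/ directory inside the job jar is another route onto the task classpath.)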
I don't understand why the TableOutputFormat class is not being found. Or is the error msg misleading, and something else is going wrong? I would very much appreciate any advice people have as to what is going wrong. Need to get this working very soon.

Regards,
Ron T.

___________________________________________
Ronald Taylor, Ph.D.
Computational Biology & Bioinformatics Group
Pacific Northwest National Laboratory
902 Battelle Boulevard
P.O. Box 999, Mail Stop J4-33
Richland, WA 99352 USA
Office: 509-372-6568
Email: [email protected]

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

contents of the "ProteinCounter1.java" file:

// to compile:
//   javac ProteinCounter1.java
//   jar cf ProteinCounterTest.jar *.class
// to run:
//   hadoop jar ProteinCounterTest.jar ProteinCounter1

import java.io.*;
import java.util.*;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;

import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.io.*;
import org.apache.hadoop.hbase.mapreduce.*;
import org.apache.hadoop.hbase.util.*;

// %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

/**
 * Counts the number of times each protein appears in the proteinTable.
 */
public class ProteinCounter1 {

    static class ProteinMapper1
            extends TableMapper<ImmutableBytesWritable, IntWritable> {

        private int numRecords = 0;
        private static final IntWritable one = new IntWritable(1);

        @Override
        public void map(ImmutableBytesWritable row, Result values, Context context)
                throws IOException {
            // retrieve the value of proteinID, which is the row key for each
            // protein in the proteinTable
            ImmutableBytesWritable proteinID_Key = new ImmutableBytesWritable(row.get());
            try {
                context.write(proteinID_Key, one);
            } catch (InterruptedException e) {
                throw new IOException(e);
            }
            numRecords++;
            if ((numRecords % 100) == 0) {
                context.setStatus("mapper processed " + numRecords
                        + " proteinTable records so far");
            }
        }
    }

    public static class ProteinReducer1
            extends TableReducer<ImmutableBytesWritable, IntWritable, ImmutableBytesWritable> {

        public void reduce(ImmutableBytesWritable proteinID_key,
                           Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            Put put = new Put(proteinID_key.get());
            put.add(Bytes.toBytes("resultFields"), Bytes.toBytes("total"),
                    Bytes.toBytes(sum));
            System.out.println(String.format("stats : proteinID_key : %d, count : %d",
                    Bytes.toInt(proteinID_key.get()), sum));
            context.write(proteinID_key, put);
        }
    }

    public static void main(String[] args) throws Exception {

        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "HBaseTest_Using_ProteinCounter");
        job.setJarByClass(ProteinCounter1.class);

        Scan scan = new Scan();
        String colFamilyToUse = "proteinFields";
        String fieldToUse = "Protein_Ref_ID";
        // retrieve this one column from the specified family
        scan.addColumn(Bytes.toBytes(colFamilyToUse), Bytes.toBytes(fieldToUse));
        scan.setFilter(new FirstKeyOnlyFilter());

        TableMapReduceUtil.initTableMapperJob("proteinTable", scan,
                ProteinMapper1.class, ImmutableBytesWritable.class,
                IntWritable.class, job);
        TableMapReduceUtil.initTableReducerJob("testTable", ProteinReducer1.class, job);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

session output:

[rtay...@h01 Hadoop]$ javac ProteinCounter1.java
[rtay...@h01 Hadoop]$ jar cf ProteinCounterTest.jar *.class
[rtay...@h01 Hadoop]$ hadoop jar ProteinCounterTest.jar ProteinCounter1
10/09/17 15:46:18 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.job.tracker; Ignoring.
10/09/17 15:46:18 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.local.dir; Ignoring.
10/09/17 15:46:18 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.system.dir; Ignoring.
10/09/17 15:46:18 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.tasktracker.map.tasks.maximum; Ignoring.
10/09/17 15:46:18 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.tasktracker.reduce.tasks.maximum; Ignoring.
10/09/17 15:46:18 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/09/17 15:46:18 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.job.tracker; Ignoring.
10/09/17 15:46:18 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.local.dir; Ignoring.
10/09/17 15:46:18 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.system.dir; Ignoring.
10/09/17 15:46:18 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.tasktracker.map.tasks.maximum; Ignoring.
10/09/17 15:46:18 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.tasktracker.reduce.tasks.maximum; Ignoring.
10/09/17 15:46:18 WARN conf.Configuration: hdfs-default.xml:a attempt to override final parameter: dfs.name.dir; Ignoring.
10/09/17 15:46:18 WARN conf.Configuration: hdfs-default.xml:a attempt to override final parameter: dfs.data.dir; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.job.tracker; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.local.dir; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.system.dir; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.tasktracker.map.tasks.maximum; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.tasktracker.reduce.tasks.maximum; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: hdfs-default.xml:a attempt to override final parameter: dfs.name.dir; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: hdfs-default.xml:a attempt to override final parameter: dfs.data.dir; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.job.tracker; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.local.dir; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.system.dir; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.tasktracker.map.tasks.maximum; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: mapred-default.xml:a attempt to override final parameter: mapred.tasktracker.reduce.tasks.maximum; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: hdfs-default.xml:a attempt to override final parameter: dfs.name.dir; Ignoring.
10/09/17 15:46:19 WARN conf.Configuration: hdfs-default.xml:a attempt to override final parameter: dfs.data.dir; Ignoring.
10/09/17 15:46:19 INFO zookeeper.ZooKeeperWrapper: Reconnecting to zookeeper
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.1-942149, built on 05/07/2010 17:14 GMT
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:host.name=h01.emsl.pnl.gov
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_21
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/java/jdk1.6.0_21/jre
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/home/hadoop/hadoop/bin/../conf:/usr/java/default/lib/tools.jar:/home/hadoop/hadoop/bin/..:/home/hadoop/hadoop/bin/../hadoop-0.20.2-core.jar:/home/hadoop/hadoop/bin/../lib/commons-cli-1.2.jar:/home/hadoop/hadoop/bin/../lib/commons-codec-1.3.jar:/home/hadoop/hadoop/bin/../lib/commons-el-1.0.jar:/home/hadoop/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/home/hadoop/hadoop/bin/../lib/commons-logging-1.0.4.jar:/home/hadoop/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/home/hadoop/hadoop/bin/../lib/commons-net-1.4.1.jar:/home/hadoop/hadoop/bin/../lib/core-3.1.1.jar:/home/hadoop/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/home/hadoop/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/home/hadoop/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/home/hadoop/hadoop/bin/../lib/jets3t-0.6.1.jar:/home/hadoop/hadoop/bin/../lib/jetty-6.1.14.jar:/home/hadoop/hadoop/bin/../lib/jetty-util-6.1.14.jar:/home/hadoop/hadoop/bin/../lib/junit-3.8.1.jar:/home/hadoop/hadoop/bin/../lib/kfs-0.2.2.jar:/home/hadoop/hadoop/bin/../lib/log4j-1.2.15.jar:/home/hadoop/hadoop/bin/../lib/mockito-all-1.8.0.jar:/home/hadoop/hadoop/bin/../lib/oro-2.0.8.jar:/home/hadoop/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/home/hadoop/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/home/hadoop/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/hadoop/hadoop/bin/../lib/xmlenc-0.52.jar:/home/hadoop/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/hadoop/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/home/hbase/hbase/conf:/home/hbase/hbase/hbase-0.89.20100726.jar:/home/rtaylor/HadoopWork/log4j-1.2.16.jar:/home/rtaylor/HadoopWork/zookeeper-3.3.1.jar
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/home/hadoop/hadoop/bin/../lib/native/Linux-i386-32
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:os.arch=i386
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.18-194.11.1.el5
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:user.name=rtaylor
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/rtaylor
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/rtaylor/HadoopWork/Hadoop
10/09/17 15:46:19 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=h05:2182,h04:2182,h03:2182,h02:2182,h10:2182,h09:2182,h08:2182,h07:2182,h06:2182 sessionTimeout=60000 watcher=org.apache.hadoop.hbase.zookeeper.zookeeperwrap...@dcb03b
10/09/17 15:46:19 INFO zookeeper.ClientCnxn: Opening socket connection to server h04/192.168.200.24:2182
10/09/17 15:46:19 INFO zookeeper.ClientCnxn: Socket connection established to h04/192.168.200.24:2182, initiating session
10/09/17 15:46:19 INFO zookeeper.ClientCnxn: Session establishment complete on server h04/192.168.200.24:2182, sessionid = 0x22b21c04c330002, negotiated timeout = 60000
10/09/17 15:46:20 INFO mapred.JobClient: Running job: job_201009171510_0004
10/09/17 15:46:21 INFO mapred.JobClient: map 0% reduce 0%
10/09/17 15:46:27 INFO mapred.JobClient: Task Id : attempt_201009171510_0004_m_000002_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
        at org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:193)
        at org.apache.hadoop.mapred.Task.initialize(Task.java:413)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:288)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableOutputFormat
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
        ... 4 more
10/09/17 15:46:33 INFO mapred.JobClient: Task Id : attempt_201009171510_0004_r_000051_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
        at org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:193)
        at org.apache.hadoop.mapred.Task.initialize(Task.java:413)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:354)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableOutputFormat
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
        ... 4 more

I terminated the program here via <Control><C>, since the error msgs were simply repeating.

[rtay...@h01 Hadoop]$
