I'm testing a Hadoop version upgrade on a prototype EC2 cluster. Most of it is now up and running (HDFS and HBase, at least), but I'm hitting some odd problems getting our M/R jobs to run.

(I followed all the instructions at https://wiki.cloudera.com/display/DOC/Hadoop+Upgrade+from+CDH2+or+CDH3b2+to+CDH3b3 as well as fixing a number of problems that came up in that process.)

The current problem I'm stuck on appears to be a classpath issue, but one I can't figure out. When running a job I hit this error:

10/12/07 02:01:05 INFO mapred.JobClient: Task Id : attempt_201012062243_0009_m_000182_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.HFileOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:973)
        at org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:236)
        at org.apache.hadoop.mapred.Task.initialize(Task.java:484)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:298)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
        at org.apache.hadoop.mapred.Child.main(Child.java:211)

We do use HFileOutputFormat in our M/R job, but as far as I can tell that should be handled by our existing classpath, as printed on the client when the job is submitted:

10/12/07 02:07:25 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/usr/lib/hadoop-0.20/conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib hadoop-0.20/hadoop-
...
/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar::/usr/lib/hbase/hbase.jar:/usr/lib/hbase/conf:/usr/lib/zookeeper/zookeeper.jar /usr/lib/hbase/hbase.jar:/usr/lib/hbase/conf:/usr/lib/zookeeper/zookeeper.jar

It looks to me like HFileOutputFormat should be covered by that class path:

# jar tf /usr/lib/hbase/hbase.jar | grep HFileOutputFormat
org/apache/hadoop/hbase/mapreduce/HFileOutputFormat$WriterLength.class
org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.class
org/apache/hadoop/hbase/mapreduce/HFileOutputFormat$1.class
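
The jar listing only shows that the class exists on disk, not that a given JVM can load it. A minimal sketch of a runtime check follows, using only standard JDK calls. ClassLocator is a hypothetical helper name for illustration, not something from Hadoop or HBase. It reports which jar or directory each named class resolves from, or that it cannot be found:

```java
import java.security.CodeSource;

public class ClassLocator {

    /** Returns the location a class is loaded from, or a NOT FOUND message. */
    static String locate(String className) {
        try {
            // Resolve the class without running its static initializers.
            Class<?> cls = Class.forName(className, false,
                    ClassLocator.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes from the JDK's bootstrap loader have no code source.
            return src == null ? "bootstrap/JDK" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "NOT FOUND (" + e + ")";
        } catch (LinkageError e) {
            // Covers NoClassDefFoundError raised by a missing dependency.
            return "NOT FOUND (" + e + ")";
        }
    }

    public static void main(String[] args) {
        // Each argument is a fully qualified class name to look up.
        for (String name : args) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Running it once with the client classpath printed above, and once with whatever classpath the task JVMs actually get (which may not be the same), passing org.apache.hadoop.hbase.mapreduce.HFileOutputFormat and com.google.common.base.Function as arguments, would show whether the two environments disagree.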

Any ideas here?

I have another, similar issue. For this one I have to assume that some package previously included with the base Cloudera packages is no longer included:

Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/base/Function
        at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:247)
        at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:81)

Thanks
- Adam