I'm testing a Hadoop version upgrade on a prototype EC2 cluster. I've
now gotten most of it up and running (well, HDFS and HBase at least),
but I'm hitting some odd problems getting our M/R jobs to run.
(I followed all the instructions at
https://wiki.cloudera.com/display/DOC/Hadoop+Upgrade+from+CDH2+or+CDH3b2+to+CDH3b3
and fixed a number of problems that came up in that process.)
The current problem I'm stuck on appears to be a classpath issue, but
one I can't figure out. When running a job I hit this error:
10/12/07 02:01:05 INFO mapred.JobClient: Task Id : attempt_201012062243_0009_m_000182_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.HFileOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:973)
        at org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:236)
        at org.apache.hadoop.mapred.Task.initialize(Task.java:484)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:298)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
        at org.apache.hadoop.mapred.Child.main(Child.java:211)
We do use HFileOutputFormat in our M/R job, but as far as I can tell
that should be handled by our existing classpath:
10/12/07 02:07:25 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/usr/lib/hadoop-0.20/conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib
hadoop-0.20/hadoop-
...
/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar::/usr/lib/hbase/hbase.jar:/usr/lib/hbase/conf:/usr/lib/zookeeper/zookeeper.jar
/usr/lib/hbase/hbase.jar:/usr/lib/hbase/conf:/usr/lib/zookeeper/zookeeper.jar
It looks to me like HFileOutputFormat should be covered by that classpath:
# jar tf /usr/lib/hbase/hbase.jar | grep HFileOutputFormat
org/apache/hadoop/hbase/mapreduce/HFileOutputFormat$WriterLength.class
org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.class
org/apache/hadoop/hbase/mapreduce/HFileOutputFormat$1.class
Any ideas here?
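To go with the jar tf check above, here is a minimal sketch of the same question asked from inside a JVM: can this classpath resolve the class, and from which location? ClassLocator and locate are hypothetical names for illustration, not part of Hadoop or HBase. Note that the failing trace comes from the task side (org.apache.hadoop.mapred.Child), so the answer only describes whichever JVM the check runs in and the -cp it was given.

```java
import java.security.CodeSource;

// Hypothetical helper for illustration; not part of Hadoop or HBase.
class ClassLocator {
    // Returns the location the class was loaded from, or null if this JVM
    // cannot find it on its classpath.
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running static initializers
            Class<?> c = Class.forName(className, false,
                    ClassLocator.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK core classes have no code source
            return src == null ? "(bootstrap classpath)" : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Each argument is a fully qualified class name to look up
        for (String name : args) {
            String where = locate(name);
            System.out.println(name + " -> " + (where == null ? "NOT FOUND" : where));
        }
    }
}
```

Run it with the same -cp value as the java.class.path line logged above (plus the directory holding ClassLocator.class) and pass org.apache.hadoop.hbase.mapreduce.HFileOutputFormat as the argument.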
I have another, similar issue, although for this one I have to assume
that some package that was previously included with the base Cloudera
packages is no longer included:
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/base/Function
        at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:247)
        at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:81)
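To see whether any jar in a given lib directory still provides the missing class (com.google.common.base.Function belongs to Google's Guava library), a rough sketch like the following could scan the jars in that directory. JarScanner and findJars are hypothetical names, and the directory to scan is an argument; which directories matter on a CDH3b3 node is an assumption left to the reader.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not part of Hadoop or HBase.
class JarScanner {
    // Returns the paths of all jars directly under dir that contain
    // a .class entry for the given fully qualified class name.
    static List<String> findJars(File dir, String className) throws IOException {
        String entry = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<String>();
        File[] files = dir.listFiles();
        if (files == null) {
            return hits; // not a directory, or not readable
        }
        for (File f : files) {
            if (!f.isFile() || !f.getName().endsWith(".jar")) {
                continue;
            }
            JarFile jar = new JarFile(f);
            try {
                if (jar.getJarEntry(entry) != null) {
                    hits.add(f.getPath());
                }
            } finally {
                jar.close();
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarScanner <lib-dir> <class-name>");
            return;
        }
        for (String path : findJars(new File(args[0]), args[1])) {
            System.out.println(path);
        }
    }
}
```

For example, java JarScanner /usr/lib/hbase/lib com.google.common.base.Function would print every jar in that directory containing the class; the directory shown here is only a guess at where dependency jars might live.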
Thanks
- Adam