Yep. Here's the script I'm using. Everything is happy until the job
executes under the configuration that uses the AccumuloFileOutputFormat
class:
HADOOP_BIN=/cloudbase/hadoop-0.20.2/bin
ACCUMULO_BIN=/cloudbase/accumulo-1.4.0/bin
INGESTER_JAR=/mnt/hgfs/CSI.Cloudbase/Java/CloudbaseServices/out/artifacts/CloudbaseIngesters/CloudbaseIngesters.jar
PLACEMARK_CLASS=com.comcept.cloudbase.ingesters.placemarks.PlacemarkIngester
CONFIG=/mnt/hgfs/CSI.Cloudbase/Java/CloudbaseServices/out/artifacts/CloudbaseIngesters/placemark-config.xml
KXML_JAR=/usr/lib/ncct/kxml2-2.3.0.jar
XMLPULL_JAR=/usr/lib/ncct/xmlpull-1.1.3.1.jar
XSTREAM_JAR=/usr/lib/ncct/xstream-1.4.1.jar
INGESTER_LIBS=$KXML_JAR,$XMLPULL_JAR,$XSTREAM_JAR

$HADOOP_BIN/hadoop dfs -ls /
$HADOOP_BIN/hadoop dfs -rmr /output
$HADOOP_BIN/hadoop dfs -rmr /input
$HADOOP_BIN/hadoop dfs -mkdir /input
$HADOOP_BIN/hadoop dfs -mkdir /output
$HADOOP_BIN/hadoop dfs -mkdir /output/pfailures
$HADOOP_BIN/hadoop dfs -mkdir /output/gfailures
$HADOOP_BIN/hadoop dfs -mkdir /output/efailures
$HADOOP_BIN/hadoop dfs -mkdir /output/tfailures
$HADOOP_BIN/hadoop dfs -put ./*.kml /input

$ACCUMULO_BIN/tool.sh $INGESTER_JAR $PLACEMARK_CLASS -libjars $INGESTER_LIBS -c $CONFIG
Here is the code that initializes the first job in the chain:
conf.set(_sVisTag, ic.getVisibility());
Job job = new Job(conf, "NCCT Placemark Ingester");
job.setJarByClass(this.getClass());
job.setInputFormatClass(TextInputFormat.class);
job.setMapperClass(PlacemarkMapClass.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
job.setReducerClass(PlacemarkReduceClass.class);
job.setOutputFormatClass(AccumuloFileOutputFormat.class);
AccumuloFileOutputFormat.setZooKeeperInstance(conf, ic.getInstance(), ic.getZooKeeper());

Instance instance = new ZooKeeperInstance(ic.getInstance(), ic.getZooKeeper());
Connector connector = instance.getConnector(ic.getUserName(), password);

TextInputFormat.setInputPaths(job, new Path(ic.getHdfsInput()));
AccumuloFileOutputFormat.setOutputPath(job, new Path(ic.getHdfsOutput() + "/pfiles"));

job.waitForCompletion(true);
connector.tableOperations().importDirectory(ic.getMetaTable(),
    ic.getHdfsOutput() + "/pfiles", ic.getHdfsOutput() + "/pfailures", false);
From: John Vines [mailto:[email protected]]
Sent: Tuesday, May 22, 2012 09:57
To: [email protected]
Subject: Re: AccumuloFileOutputFormat class cannot be found by child JVM
Does your script use $ACCUMULO_HOME/bin/tool.sh to kick off the
MapReduce job? That script is similar to "hadoop jar", but it will add
the Accumulo libraries to -libjars for you.
John
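As a fallback when tool.sh does not get the Accumulo jars onto the child
JVMs' classpath, the -libjars list can be built by hand. The sketch below
is illustrative only: join_jars is a hypothetical helper (not part of
Accumulo or Hadoop) that joins jar paths with commas, the format the
-libjars flag expects, and the commented usage assumes the 1.4.0 install
path from the script above.

```shell
# join_jars: join jar paths with commas, the format Hadoop's -libjars
# flag expects. Illustrative helper, not part of Accumulo or Hadoop.
join_jars() {
  local out="" jar
  for jar in "$@"; do
    # Prepend a comma only when $out is already non-empty.
    out="${out:+$out,}$jar"
  done
  printf '%s\n' "$out"
}

# Assumed usage (lib directory guessed from the 1.4.0 install path):
# LIBJARS=$(join_jars /cloudbase/accumulo-1.4.0/lib/*.jar \
#     $KXML_JAR $XMLPULL_JAR $XSTREAM_JAR)
# $HADOOP_BIN/hadoop jar $INGESTER_JAR $PLACEMARK_CLASS \
#     -libjars $LIBJARS -c $CONFIG
```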
On Tue, May 22, 2012 at 10:55 AM, <[email protected]> wrote:
Right now I'm using stand-alone mode, but is there another place I need
to put the jar file?
-----Original Message-----
From: John Armstrong [mailto:[email protected]]
Sent: Tuesday, May 22, 2012 09:49
To: [email protected]
Subject: Re: AccumuloFileOutputFormat class cannot be found by child JVM
On 05/22/2012 10:40 AM, [email protected] wrote:
> I upgraded to accumulo-1.4.0 and updated my map/reduce jobs, and now
> they don't run. The parent classpath has the accumulo-core-1.4.0.jar
> file included. Do the Accumulo jar files have to be manually put on the
> distributed cache? Any help is appreciated.
Just to check: did you replace the Accumulo JAR files on all the cluster
nodes?
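One quick way to answer that question is to probe each node for the
expected jar. The sketch below is an assumption-laden illustration:
check_jar is a hypothetical helper, and the host names and install path
in the commented loop are placeholders, not values from this thread.

```shell
# check_jar PATH: print "ok" if a regular file exists at PATH,
# "missing" otherwise. Illustrative helper for spotting stale installs.
check_jar() {
  if [ -f "$1" ]; then
    echo ok
  else
    echo missing
  fi
}

# Assumed usage over ssh (hypothetical host list and install path):
# for h in node1 node2 node3; do
#   printf '%s: ' "$h"
#   ssh "$h" '[ -f /cloudbase/accumulo-1.4.0/lib/accumulo-core-1.4.0.jar ] \
#       && echo ok || echo missing'
# done
```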