Hello, I'm trying to trigger a Mahout job from inside my Java application (running in Eclipse), and get it running on my cluster. I have a main class that simply contains:
    String[] args = new String[] {
            "--input", "/input/triples.csv",
            "--output", "/output/vectors.txt",
            "--similarityClassname", VectorSimilarityMeasures.SIMILARITY_COOCCURRENCE.toString(),
            "--numRecommendations", "10000",
            "--tempDir", "temp/" + System.currentTimeMillis() };
    Configuration conf = new Configuration();
    ToolRunner.run(conf, new RecommenderJob(), args);

If I package the whole project up into a single jar (using Maven), copy it to the namenode, and run it with "hadoop jar project.jar", it works fine. But if I try to run it from my dev PC in Eclipse (with all the same dependencies still on the classpath), and add the three Hadoop XML configuration files to the classpath, it launches the Hadoop jobs, but they fail with errors like:

    12/07/26 14:42:09 INFO mapred.JobClient: Task Id : attempt_201206261211_0173_m_000001_0, Status : FAILED
    Error: java.lang.ClassNotFoundException: com.google.common.primitives.Longs
        at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
        ...

What I'm trying to create is a self-contained JAR that can be run from the command line and launch the Mahout job on the cluster. I've got this all working with embedded Pig scripts, but I can't get it working here. Any help is appreciated, as is advice on better ways to trigger the jobs from code. Thanks.
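For reference, here is the launcher written out as a self-contained class with the imports I believe it needs. The package for the VectorSimilarityMeasures enum is from the Mahout 0.7-era layout and may differ in other versions, and the class name is just a placeholder:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.util.ToolRunner;
    import org.apache.mahout.cf.taste.hadoop.item.RecommenderJob;
    import org.apache.mahout.math.hadoop.similarity.cooccurrence.measures.VectorSimilarityMeasures;

    public class RecommenderLauncher {
        public static void main(String[] ignored) throws Exception {
            String[] args = new String[] {
                    "--input", "/input/triples.csv",
                    "--output", "/output/vectors.txt",
                    "--similarityClassname", VectorSimilarityMeasures.SIMILARITY_COOCCURRENCE.toString(),
                    "--numRecommendations", "10000",
                    "--tempDir", "temp/" + System.currentTimeMillis() };

            // Reads the cluster settings from the Hadoop *-site.xml files on the classpath
            Configuration conf = new Configuration();

            // RecommenderJob implements Tool, so ToolRunner parses the generic Hadoop options
            ToolRunner.run(conf, new RecommenderJob(), args);
        }
    }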