Hi Syed,

Do you mean I need to deploy the Mahout jars to the lib directory of
the master node? Or all the data nodes? Or is there a way to simply
tell the Hadoop job launcher to upload the jars itself?
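
To frame the question: the only upload mechanism I'm aware of is
Hadoop's generic -libjars option, which ToolRunner parses through
GenericOptionsParser before handing the remaining arguments to the
job. Below is a rough, untested-on-my-cluster sketch of how I would
build the argument array for it. The class name LibJarsArgs and the
jar paths are placeholders I made up for illustration, and the
generic option has to come before the job's own arguments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LibJarsArgs {

    /**
     * Returns a new argument array with "-libjars jar1,jar2,..." placed
     * in front of the job's own arguments. If no jars are given, the job
     * arguments are returned unchanged (as a copy).
     */
    public static String[] withLibJars(List<String> jarPaths, String[] jobArgs) {
        if (jarPaths.isEmpty()) {
            return jobArgs.clone();
        }
        List<String> all = new ArrayList<>();
        all.add("-libjars");
        // -libjars takes a single comma-separated list of jar paths
        all.add(String.join(",", jarPaths));
        all.addAll(Arrays.asList(jobArgs));
        return all.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Hypothetical local jar locations; substitute the real ones.
        List<String> jars = Arrays.asList("lib/guava.jar", "lib/mahout-core.jar");
        String[] jobArgs = { "--input", "/input/triples.csv",
                             "--output", "/output/vectors.txt" };
        String[] full = withLibJars(jars, jobArgs);
        System.out.println(String.join(" ", full));
        // The resulting array would then be passed as the last argument to
        // ToolRunner.run(conf, new RecommenderJob(), full).
    }
}
```

As far as I understand it, this would only cover the dependency jars
(the missing com.google.common.primitives.Longs class lives in Guava);
the job's own classes would still need to reach the cluster somehow.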

Steve

On Thu, Jul 26, 2012 at 6:10 PM, syed kather <in.ab...@gmail.com> wrote:
> Hi Steve,
> I suspect you missed copying that specific jar into your Hadoop lib
> directories. Have a look at your lib directory.
> On Jul 27, 2012 4:49 AM, "Steve Armstrong" <st...@stevearm.com> wrote:
>
>> Hello,
>>
>> I'm trying to trigger a Mahout job from inside my Java application
>> (running in Eclipse), and get it running on my cluster. I have a main
>> class that simply contains:
>>
>> String[] args = new String[] { "--input", "/input/triples.csv",
>> "--output", "/output/vectors.txt", "--similarityClassname",
>> VectorSimilarityMeasures.SIMILARITY_COOCCURRENCE.toString(),
>> "--numRecommendations", "10000", "--tempDir", "temp/" +
>> System.currentTimeMillis() };
>> Configuration conf = new Configuration();
>> ToolRunner.run(conf, new RecommenderJob(), args);
>>
>> If I package the whole project up in a single jar (using Maven), copy
>> it to the namenode, and run it with "hadoop jar project.jar" it works
>> fine. But if I try and run it from my dev pc in Eclipse (where all the
>> same dependencies are still in the classpath), and add the 3 hadoop
>> xml files to the classpath, it triggers hadoop jobs, but they fail
>> with errors like:
>>
>> 12/07/26 14:42:09 INFO mapred.JobClient: Task Id :
>> attempt_201206261211_0173_m_000001_0, Status : FAILED
>> Error: java.lang.ClassNotFoundException: com.google.common.primitives.Longs
>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>> ...
>>
>> What I'm trying to create is a self-contained JAR that can be run from
>> the command-line and launch the mahout job on the cluster. I've got
>> this all working with embedded pig scripts, but I can't get it working
>> here.
>>
>> Any help is appreciated, or advice on better ways to trigger the jobs from
>> code.
>>
>> Thanks
>>
