1) Are you sure you can't use the Mahout command line? If you are not sure, try the command line first; otherwise proceed to #2.
2) Are you resolved to run it embedded on the client side? If not, go back to using the command line. If yes, your best bet is to build a Maven project. Unfortunately, I cannot help you with Maven references within the framework of this list; I think you need to read through some Maven resource on how to build that.
3) Are you also running the MR backend side with Mahout dependencies? If yes, you need something called mahout-core-0.6-SNAPSHOT-job.jar (if you build Mahout from source, it will land in the core/target folder). That is a so-called "hadoop job" jar, which you can redistribute to the MR backend tasks. If that's what you want to do, try asking on the Hadoop forums how to do it in your mapreduce-enabled applications; I am not really 100% sure myself. The standard hadoop command takes such a jar through its jar subcommand (hadoop jar <job jar> ...).
4) Sometimes the inverse is also needed: including some of _your_ libraries so that they run in the backend with the Mahout tasks (for example, a custom Lucene text analyzer for text inputs). I think this may also be achievable from the Mahout command line by passing your own hadoop job jar through the same standard mechanism, but I am not 100% sure. I did something like that long ago, but I can't remember now how it was done.

Thanks.
-Dmitriy

On Thu, Dec 29, 2011 at 1:02 AM, rahul raghavendhra <[email protected]> wrote:
> It sound better.. can u please elaborate so that new uses like me can
> learn.. thanks Dmitry.. Please help.. thanks in advance
>
> ./rahul
>
>
> On Thu, Dec 29, 2011 at 2:07 PM, Dmitriy Lyubimov <[email protected]> wrote:
>
>> > (I actually don't do that, I do it slightly
>> >other way, by publishing all dependency jars of my project on hdfs and
>> >then use DistributedCache to add them to my MR classpath, so i don't
>> >know for sure about using mahout hadoop job jar outside the command
>> line).
>> >But command line is still probably the best way to try something,
>> >embedding takes more time.
>> >>
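
For illustration only, here is a minimal Java sketch of how the command line from points 3 and 4 might be put together from client code. It only builds the argument list for "hadoop jar" and does not touch any Hadoop or Mahout API. The class name HadoopJobCommand, the driver class com.example.MyDriver and the jar name my-analyzer.jar are hypothetical placeholders. The -libjars flag is a Hadoop generic option; whether a particular Mahout driver honours it is an assumption the thread does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Mahout or Hadoop): assembles the argument
// list for "hadoop jar <job jar> <main class> [-libjars a.jar,b.jar] <args>".
final class HadoopJobCommand {

    static List<String> build(String jobJar, String mainClass,
                              List<String> extraJars, List<String> appArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("hadoop");
        cmd.add("jar");
        cmd.add(jobJar);
        cmd.add(mainClass);
        // -libjars is a Hadoop generic option. It is only honoured when the
        // main class parses generic options (for example via ToolRunner),
        // and it must come before the application's own arguments.
        if (!extraJars.isEmpty()) {
            cmd.add("-libjars");
            cmd.add(String.join(",", extraJars));
        }
        cmd.addAll(appArgs);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build(
            "core/target/mahout-core-0.6-SNAPSHOT-job.jar",      // job jar from point 3
            "com.example.MyDriver",                              // placeholder driver class
            Arrays.asList("my-analyzer.jar"),                    // placeholder extra jar (point 4)
            Arrays.asList("--input", "in", "--output", "out"));  // placeholder job arguments
        System.out.println(String.join(" ", cmd));
        // To actually launch it, something like:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Keeping the command as a list of arguments rather than one string avoids shell quoting problems if it is later handed to ProcessBuilder.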
