I checked out the Mahout trunk (snapshot) from svn and installed it using Maven. I tested k-means and it works well. How do I run the examples from the Mahout in Action book? What are the steps to follow? Please help.
On Fri, Dec 30, 2011 at 3:02 AM, Ted Dunning <[email protected]> wrote:

> Here are some sample maven projects that use mahout. You can copy the
> dependencies from the pom.xml file after you set up an empty project. Or
> you can copy this project and delete all the code before inserting your
> own.
>
> https://github.com/tdunning/Chapter-16
>
> https://github.com/tdunning/pig-vector
>
> On Thu, Dec 29, 2011 at 12:25 PM, Dmitriy Lyubimov <[email protected]> wrote:
>
> > 1) Are you sure you can't use Mahout command line?
> >
> > if no, try command line, otherwise proceed to #2.
> >
> > 2) Are you resolved to run it embedded client side?
> >
> > if no, go back to command line use.
> > if yes, your best bet is to build a maven project. Unfortunately I
> > cannot help you with maven references within framework of this list. I
> > think you need some maven resource to read through on how to build that.
> >
> > 3) Are you also running MR backend-side with mahout dependencies as well?
> > If yes, you need something called mahout-core-0.6-SNAPSHOT-job.jar (if
> > you build Mahout from source, it will land in core/target folder).
> > That's something called "hadoop job" jar which you can redistribute to
> > MR backend tasks. If that's what you want to do, try to ask on Hadoop
> > forums how to do it in your mapreduce-enabled applications, I am not
> > really 100% sure myself. Standard hadoop command takes those with
> > --jar option.
> >
> > 4) Sometimes it is also needed to do something of inverse nature: to
> > include some of _your_ libraries running in backend with Mahout tasks.
> > (example being: custom lucene text analyzer for text inputs). I think
> > it may be also achievable with mahout command line option by using the
> > same standard --jar option for your own hadoop job jar, but I am not
> > 100% sure. I did something like that long ago but I can't remember how
> > it was done now.
> >
> > Thanks.
> > -Dmitriy
> >
> > On Thu, Dec 29, 2011 at 1:02 AM, rahul raghavendhra <[email protected]> wrote:
> >
> > > It sounds better. Can you please elaborate so that new users like me can
> > > learn? Thanks, Dmitriy. Please help. Thanks in advance.
> > >
> > > ./rahul
> > >
> > > On Thu, Dec 29, 2011 at 2:07 PM, Dmitriy Lyubimov <[email protected]> wrote:
> > >
> > >> > (I actually don't do that, I do it a slightly
> > >> > different way, by publishing all dependency jars of my project on hdfs and
> > >> > then use DistributedCache to add them to my MR classpath, so I don't
> > >> > know for sure about using mahout hadoop job jar outside the command
> > >> > line).
> > >> > But command line is still probably the best way to try something,
> > >> > embedding takes more time.
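
For reference, Ted's suggestion of copying the dependencies from a pom.xml into an empty project might look roughly like the fragment below. This is only a sketch: the org.apache.mahout:mahout-core coordinates and the 0.6-SNAPSHOT version are assumptions inferred from the job jar name mentioned in the thread, so check them against whatever your own trunk build installed into the local Maven repository.

```xml
<!-- Sketch of the dependencies section of a project pom.xml.
     Coordinates and version are assumptions, not taken from the sample projects. -->
<dependencies>
  <dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-core</artifactId>
    <!-- assumed to match the snapshot installed locally with "mvn install" -->
    <version>0.6-SNAPSHOT</version>
  </dependency>
</dependencies>
```

The sample projects linked above may list more dependencies than this, and their pom.xml files are the better source for the exact list.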
