On Jul 28, 2011, at 6:47 AM, Dave Gettier wrote:

> Can you elaborate on that? Not sure what "--method mapreduce" means.
>
> Some details:
>
> - I have a CSV file.
> - I programmatically create namedVectors (a java program).
> - I run the kmeansdriver which (I thought) submits a number of mapreduce
> jobs.

Are you running KMeansDriver on trunk? I'm pretty sure it has two options for method: --method sequential or --method mapreduce.

Not sure if that is the same driver as used in Mahout in Action. Are you actually running on a Hadoop cluster or just locally?

> - The output is another sequence file, which I then turn back into CSV.
> - All sequence files are stored locally.
>
>
> Side note:
> I am overriding the distance method.
> I can choose which distance method via a config file.
> I can run Euclidean, or my distance function.
>
> -----Original Message-----
> From: Grant Ingersoll [mailto:[email protected]]
> Sent: Thursday, July 28, 2011 9:23 AM
> To: [email protected]
> Subject: Re: Kmeans runs successfully, but no map/reduce jobs
>
> Do you need --method mapreduce passed in?
>
> On Jul 27, 2011, at 4:20 PM, Dave Gettier wrote:
>
>>
>> I am running a kmeans application which was adapted from example 7.2 of
>> Mahout in Action. The java program runs successfully, giving me the
>> expected results; however, there are no map/reduce jobs being kicked off.
>> My understanding was that KMeansCluster runs locally, but KMeansDriver runs
>> on the cluster. How does one point the job to run on the cluster? Or am I
>> missing something?
>>
>> KMeansDriver.run(conf,
>>     new Path(cp.getsDataDir() + "/points"),
>>     new Path(cp.getsDataDir() + "/clusters"),
>>     new Path(cp.getsDataDir() + "/outputs"),
>>     new EuclideanDistanceMeasure(),
>>     .001, 10, true, true);
>>
>> Thanks in advance!
>>
>> -DG
>>
>
> --------------------------------------------
> Grant Ingersoll
>

--------------------------------------------
Grant Ingersoll
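
For reference on the "cluster or just locally" question: in Hadoop releases of this era (0.20.x), a job runs in the in-process LocalJobRunner whenever mapred.job.tracker is left at its default of "local", and paths resolve against the local disk whenever fs.default.name is left at file:///. The sketch below only illustrates that rule. It uses java.util.Properties as a stand-in for Hadoop's Configuration so that it runs without Hadoop on the classpath; the class name ClusterCheck and the host names are hypothetical.

```java
import java.util.Properties;

// Illustration only: Properties stands in for org.apache.hadoop.conf.Configuration.
class ClusterCheck {
    // Hadoop 0.20-era property names.
    static final String JOB_TRACKER = "mapred.job.tracker";
    static final String DEFAULT_FS = "fs.default.name";

    // True when the settings would select the in-process LocalJobRunner,
    // i.e. the job tracker is unset or explicitly "local".
    static boolean runsLocally(Properties conf) {
        return "local".equals(conf.getProperty(JOB_TRACKER, "local"));
    }

    // True when paths would resolve against the local disk rather than HDFS.
    static boolean usesLocalFileSystem(Properties conf) {
        return conf.getProperty(DEFAULT_FS, "file:///").startsWith("file:");
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(runsLocally(conf));         // prints "true": nothing points at a cluster

        // Hypothetical addresses; substitute the real JobTracker and NameNode.
        conf.setProperty(JOB_TRACKER, "jobtracker.example.com:9001");
        conf.setProperty(DEFAULT_FS, "hdfs://namenode.example.com:9000");
        System.out.println(runsLocally(conf));         // prints "false"
        System.out.println(usesLocalFileSystem(conf)); // prints "false"
    }
}
```

With a real Configuration, the same two keys can be set through conf.set(...) before calling KMeansDriver.run, or picked up from the cluster's core-site.xml and mapred-site.xml when those files are on the classpath (for example, when the program is launched with "hadoop jar"). If neither happens, the run stays local and no jobs appear on the JobTracker, whatever the driver's method option is.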
