Hi,

I took a break from this task and moved on to some other tasks on my list. When I revisited it this morning, I found a problem with the sort utility and the LC_COLLATE environment variable that made my sequenceFile generation script fail. I have now managed to get the command-line utility to generate the clusters:
$ bin/mahout fkmeans --input test/sensei --output test/clusters --clusters test/clusters/clusters-0 --clustering --overwrite --emitMostLikely false --numClusters 3 --maxIter 10 --m 5

However, when I run the cluster dumper, I only see the three cluster center points and not the points themselves, even though I included the --clustering and --emitMostLikely options when I did the clustering:

$ ./bin/mahout clusterdump --seqFileDir test/clusters/clusters-1 --pointsDir test/clusters/clusteredPoints --output sensei.txt

I tested this with the latest revision of mahout-0.6-snapshot.

When I try to do the clustering with my Clojure code (the same code I posted before), it still gives me the same error. Any idea?

Regards,
Jeffrey04

>________________________________
>From: Jake Mannix <[email protected]>
>To: [email protected]; Jeffrey <[email protected]>
>Sent: Friday, August 26, 2011 1:23 AM
>Subject: Re: Clustering (fkmeans) with Mahout using Clojure
>
>On Thu, Aug 25, 2011 at 10:11 AM, Jeffrey <[email protected]> wrote:
>
>I am trying to write a short script to cluster my data via clojure (calling
>Mahout classes though). I have my input data in this format (which is an
>output from a
>
>This line you're instantiating a new SequentialAccessSparseVector, with the
>value of cardinality being "count (vals photo_list)" - you need to have all of
>your Vectors exist with the same cardinality (ie. they live in the same vector
>space, mathematically). So you need to figure out how big they need to be,
>and instantiate them *all* with this cardinality.
>
>    (new SequentialAccessSparseVector (count (vals photo_list)))
>
>The error you are getting below:
>
>EDIT: apparently cardinality needs to be 1, need to figure out how to do it
>
>is actually telling you that you're trying to say all vectors should be
>cardinality 1, but it found some vectors with cardinality 10, so it threw an
>exception.
>
>  -jake
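
To make the cardinality point in the quoted reply concrete, here is a minimal sketch in plain Java with no Mahout dependency. The class and method names (CardinalityDemo, commonCardinality, toVectors) and the sample rows are hypothetical and only for illustration. It assumes each record is a map from feature index to value, computes one cardinality for the whole data set first, and only then allocates every vector with that same size. In the Clojure script, that single number would presumably be what is passed to each SequentialAccessSparseVector constructor instead of a per-record count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class CardinalityDemo {

    /** Smallest dimension that can hold every feature index seen in any row. */
    public static int commonCardinality(List<Map<Integer, Double>> rows) {
        int maxIndex = -1;
        for (Map<Integer, Double> row : rows) {
            for (int index : row.keySet()) {
                maxIndex = Math.max(maxIndex, index);
            }
        }
        return maxIndex + 1;
    }

    /** Builds one array per row; every array has the same length. */
    public static List<double[]> toVectors(List<Map<Integer, Double>> rows) {
        // Computed once for the whole data set, not once per row.
        int cardinality = commonCardinality(rows);
        List<double[]> vectors = new ArrayList<>();
        for (Map<Integer, Double> row : rows) {
            double[] vector = new double[cardinality];
            for (Map.Entry<Integer, Double> entry : row.entrySet()) {
                vector[entry.getKey()] = entry.getValue();
            }
            vectors.add(vector);
        }
        return vectors;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: one row with 1 feature, one with 2.
        List<Map<Integer, Double>> rows = new ArrayList<>();
        rows.add(Map.of(0, 1.0));
        rows.add(Map.of(3, 2.0, 9, 1.0));
        for (double[] vector : toVectors(rows)) {
            System.out.println(vector.length);
        }
    }
}
```

Sizing each vector by the number of values in its own record would give these two rows lengths 1 and 2, which is the kind of mismatch the exception reports. Sizing by the largest index across all rows puts every vector in the same space.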
