Any pointers on where I should be looking?

On Sun, Sep 11, 2011 at 11:03 PM, Varun Thacker <[email protected]> wrote:
> I'm using Mahout 0.5. I am using Lucene (the matching version in the
> pom.xml) to index a tiny data set for testing. This is what the index
> looks like:
>
> _0.fdt  _0.fnm  _0.nrm  _0.tii  _0.tvd  _0.tvx  segments.gen
> _0.fdx  _0.frq  _0.prx  _0.tis  _0.tvf  segments_1
>
> Now I use this command to create vectors from the Lucene index (same as
> the wiki command):
>
> ./mahout lucene.vector --dir /home/varun/myindex/ --field title --dictOut /home/varun/myindex/dict.txt --output /home/varun/myindex/out.txt --norm 1
>
> Next, I copy the /myindex folder into the /bin/testdata folder, as that
> seems to be the default directory for the data.
>
> To run K-means I use this command:
>
> ./mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
>
> This is the error I get: http://pastebin.com/ADPm0Vbx
>
> Am I missing any steps?
>
> Also, on a side note, is there a post on using MinHash in Mahout?
>
> --
> Regards,
> Varun Thacker
> http://varunthacker.wordpress.com

--
Regards,
Varun Thacker
http://varunthacker.wordpress.com
