I'm using Mahout 0.5, with Lucene (the matching version in the pom.xml) to
index a tiny data set for testing. This is what the index looks like:
_0.fdt
_0.fnm
_0.nrm
_0.tii
_0.tvd
_0.tvx
segments.gen
_0.fdx
_0.frq
_0.prx
_0.tis
_0.tvf
segments_1

Now I use this command to create vectors from the Lucene index (same as
the wiki command):

./mahout lucene.vector --dir /home/varun/myindex/ --field title --dictOut
/home/varun/myindex/dict.txt --output /home/varun/myindex/out.txt --norm 1

Next I copy the /myindex folder into the /bin/testdata folder, as that
seems to be the default directory for the data.

To run K-means I use this command:
./mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job

This is the error I get: http://pastebin.com/ADPm0Vbx

Am I missing any steps?

Also, on a side note, is there a post on using MinHash in Mahout?


-- 


Regards,
Varun Thacker
http://varunthacker.wordpress.com
