On Sep 11, 2011, at 1:33 PM, Varun Thacker wrote:

> I'm using Mahout 0.5. I am using Lucene (the matching version in the
> pom.xml) to index a tiny data set for testing. This is what the index looks
> like:
> _0.fdt
> _0.fnm
> _0.nrm
> _0.tii
> _0.tvd
> _0.tvx
> segments.gen
> _0.fdx
> _0.frq
> _0.prx
> _0.tis
> _0.tvf
> segments_1
> 
> Now I use this command to create vectors from the Lucene index (same as
> the wiki command):
> 
> ./mahout lucene.vector --dir /home/varun/myindex/ --field title --dictOut
> /home/varun/myindex/dict.txt --output /home/varun/myindex/out.txt --norm 1
> 
> Now I copy the /myindex folder into the /bin/testdata folder, as that
> seems to be the default directory for the data.
> 
> To run K-means I use this command:
> ./mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job

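I don't think this is how you run K-means on your own vectors. That class is the synthetic control example job, which expects its own sample data in testdata rather than vectors created from a Lucene index.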

You should be able to do:
./mahout kmeans --input ... 
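
As a rough sketch of what a fuller invocation might look like: the input is the vector file written by lucene.vector above, while the seed and output directories, the cluster count, and the iteration limit are placeholders I made up for illustration. The flag names are as I recall them from the 0.5 driver, so please check them against ./mahout kmeans --help before relying on them. The snippet only builds and prints the command (a dry run), so nothing is executed against Mahout.

```shell
# Dry-run sketch: compose the k-means command and print it rather than run it.
VECTORS="/home/varun/myindex/out.txt"   # vectors written by lucene.vector (path from the thread)
SEEDS="/home/varun/kmeans/seeds"        # hypothetical: where the initial clusters go
OUT="/home/varun/kmeans/output"         # hypothetical: where the results go
K=3                                     # hypothetical number of clusters
MAX_ITER=10                             # hypothetical iteration limit

# Assumed flag names; verify with ./mahout kmeans --help on your install.
KMEANS_CMD="./mahout kmeans --input $VECTORS --clusters $SEEDS --output $OUT --numClusters $K --maxIter $MAX_ITER --distanceMeasure org.apache.mahout.common.distance.CosineDistanceMeasure --overwrite"

echo "$KMEANS_CMD"
```

Once the printed command looks right for your setup, run it from the Mahout bin directory in place of the syntheticcontrol Job.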

> 
> This is the error which I get: http://pastebin.com/ADPm0Vbx
> 
> Am I missing any steps?
> 
> Also on a side note is there a post on using MinHash in Mahout?
> 
> 
> -- 
> 
> 
> Regards,
> Varun Thacker
> http://varunthacker.wordpress.com

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com
Lucene Eurocon 2011: http://www.lucene-eurocon.com
