The dictionary file is optional. It contains a list of element names for
the input Vectors (I'm not sure how it's delimited). See the new code in
trunk/utils in TestClusterDumper for some examples. I still need to write
tests for meanshift and fuzzy kmeans to make sure they work, but I
imagine they do. I also need to write tests that include the points, but
that appears to be done in memory, so it likely won't scale to the data
set on your 5-node cluster.
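
To make the role of the dictionary concrete, here is a rough sketch of how a list of element names could be used to label vector entries in dumped output. It does not use any Mahout classes. The class DictionarySketch, the methods parseDictionary and label, and the sample names are all hypothetical. The one-name-per-line layout and the idea that the i-th name labels element i are assumptions for illustration only, since the actual delimiter is not confirmed above.

```java
import java.util.ArrayList;
import java.util.List;

public class DictionarySketch {

    // Hypothetical helper. ASSUMPTION: one element name per line;
    // the real dictionary format may differ. Blank lines are ignored.
    static List<String> parseDictionary(String text) {
        List<String> names = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String name = line.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Label element i with the i-th name (assumed mapping). Since the
    // dictionary is optional, fall back to the numeric index when no
    // name is available.
    static String label(double[] vector, List<String> names) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < vector.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            String name = (names != null && i < names.size())
                    ? names.get(i)
                    : Integer.toString(i);
            sb.append(name).append(':').append(vector[i]);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        // Made-up sample names and values, for illustration only.
        List<String> names = parseDictionary("height\nweight\nage\n");
        double[] center = {1.5, 2.0, 3.25};
        System.out.println(label(center, names)); // with a dictionary
        System.out.println(label(center, null));  // without one
    }
}
```

Without a dictionary the entries are simply labeled by index, which is why the file can be left out.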
Jeff
adam35413 wrote:
I have been able to successfully run the kmeans and meanshift examples on a
5-node Hadoop cluster. However, when it comes to dealing with the output, I
am a bit confused. I found the following page:
http://cwiki.apache.org/MAHOUT/viewing-results.html, but when I went to
track down the dictionary file I was unable to find it. Do I need to
generate the dictionary file separately or manually?
Thanks!