Check out ClusterDumper in utils (utils/src/main/java/org/apache/mahout/utils/clustering/ClusterDumper.java). This utility prints each cluster ID along with the IDs of the vectors associated with that cluster.
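
To make the idea concrete, here is a rough sketch of the kind of "cluster ID -> point IDs" listing you are after. It is plain Java and does not call any Mahout code: the class name ClusterIdSketch, the methods groupByCluster and format, and the sample IDs are all made up for illustration, and the actual ClusterDumper output format may well look different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: not part of Mahout and not ClusterDumper itself.
public class ClusterIdSketch {

    // Groups point IDs by the cluster ID each point was assigned to.
    // The input maps a point ID to its cluster ID; clusters come back sorted by ID.
    public static Map<Integer, List<String>> groupByCluster(Map<String, Integer> assignments) {
        Map<Integer, List<String>> clusters = new TreeMap<>();
        for (Map.Entry<String, Integer> e : assignments.entrySet()) {
            clusters.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return clusters;
    }

    // Renders one line per cluster: the cluster ID followed by its point IDs.
    public static String format(Map<Integer, List<String>> clusters) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Integer, List<String>> e : clusters.entrySet()) {
            sb.append("cluster ").append(e.getKey()).append(": ")
              .append(String.join(", ", e.getValue())).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up assignments; in practice these would come from the clustering output.
        Map<String, Integer> assignments = new LinkedHashMap<>();
        assignments.put("point-a", 0);
        assignments.put("point-b", 1);
        assignments.put("point-c", 0);
        System.out.print(format(groupByCluster(assignments)));
    }
}
```

The sketch only works if every point carries an ID of its own, which is the same precondition you mention for your data set.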
--shashi

On Wed, Nov 25, 2009 at 5:47 AM, Liang Chenmin <[email protected]> wrote:
> Hi all,
> I am a newbie to Mahout. I have a question about how to incorporate some
> naming for cluster and points in the synthetic data cluster example.
>
> After getting the output of the synthetic data cluster, we have 6
> clusters, and each one looks like:
>
> ###First is the information of the cluster
> 0:name::{"class":"org.apache.mahout.matrix.SparseVector","vector":"{\"values\":{\"indices\":[0,1,2...59],\"values\":[29.58838112577385,...],\"numMappings\":60},\"cardinality\":60,\"lengthSquared\":-1.0,\"name\":\"\"}"}
>
> ###And then follow by points belong to this cluster:
> Points:
> {"class":"org.apache.mahout.matrix.SparseVector","vector":"{\"values\":{\"indices\":[0,1,2,...,59],\"values\":[28.7812,34.4632,......
> ],],\"numMappings\":60},\"cardinality\":60,\"lengthSquared\":-1.0,\"name\":\"\"}"},
>
> {"class":"org.apache.mahout.matrix.SparseVector","vector":"{\"values\":{\"indices\"
> ....
>
>
> Is there a way for me to specify the name of the cluster? And more
> importantly, if I actually have ID for each point, how could I show the ID
> for each point in the final result? I want to see clearly the IDs in each
> cluster. I have used my own data also, and the output is similar to the ones
> above, although the indices are not the same as my matrix are sparse. And as
> my data set is large, getting the IDs is quite important for me.
>
> Thanks,
> Mandy
>
