On 03.11.2011 10:53, WangRamon wrote:
> Hi All,
>
> I'm using KMeansClusterer. I will use KMeansDriver in a Hadoop
> environment later, but I think it will be easier to understand things with
> KMeansClusterer first. My question is: I cannot find a way to find out which
> cluster a point belongs to after running KMeansClusterer. I expected to
> find some API on the Cluster interface to get all points/vectors that belong
> to a given cluster, but... did I miss something?
>
> Thanks a lot.
>
> Cheers,
> Ramon
>
You can find your answer in the ClusterDumper utility. Check the code at
[1]; the wiki page for the tool is at [2]. I modified the printCluster
method to suppress some of the output and print the points of each cluster,
with something like:

    // Header line for this cluster: id, running count, number of points.
    String clusterInfo = String.format("Cluster %d (%d) with %d points.\n",
        value.getId(), clusterCount, value.getNumPoints());
    // Assumed: the excerpt builds clusterInfo but does not show where it is
    // written, so it is written here before the point list.
    writer.write(clusterInfo);

    // clusterIdToPoints maps a cluster id to the points assigned to it.
    List<WeightedVectorWritable> points = clusterIdToPoints.get(value.getId());
    if (points != null) {
      writer.write("\tCluster points:\n\t");
      for (WeightedVectorWritable point : points) {
        writer.write(String.valueOf(point.getWeight()));
        writer.write(": ");
        // Only a NamedVector carries a name that can be printed.
        if (point.getVector() instanceof NamedVector) {
          writer.write(((NamedVector) point.getVector()).getName() + " ");
        }
      }
      writer.write('\n');
    }

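If it helps to see the idea without any Mahout classes, here is a minimal plain-Java sketch of what "which cluster does a point belong to" comes down to for k-means: assign each point to its nearest centroid and group the point names by cluster id. Everything in it (the class ClusterAssignment, the methods squaredDistance, nearestCentroid and groupByCluster, and the sample data) is made up for illustration and is not part of the Mahout API; it also assumes squared Euclidean distance, which may not be the distance measure you configured.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in, not a Mahout class.
public class ClusterAssignment {

    // Squared Euclidean distance between two vectors of equal length.
    public static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Index of the centroid closest to the point; ties go to the lowest index.
    public static int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double dist = squaredDistance(point, centroids[c]);
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // Groups point names by the index of their nearest centroid.
    // Clusters that receive no points do not appear in the result.
    public static Map<Integer, List<String>> groupByCluster(
            Map<String, double[]> points, double[][] centroids) {
        Map<Integer, List<String>> result = new TreeMap<>();
        for (Map.Entry<String, double[]> e : points.entrySet()) {
            int id = nearestCentroid(e.getValue(), centroids);
            result.computeIfAbsent(id, k -> new ArrayList<>()).add(e.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample centroids and named points, invented for this example.
        double[][] centroids = { {0.0, 0.0}, {10.0, 10.0} };
        Map<String, double[]> points = new LinkedHashMap<>();
        points.put("p1", new double[] {1.0, 1.0});
        points.put("p2", new double[] {9.0, 8.0});
        points.put("p3", new double[] {0.5, -0.5});

        groupByCluster(points, centroids).forEach((id, names) ->
            System.out.println("Cluster " + id + ": " + names));
    }
}
```

The map returned by groupByCluster plays the same role as clusterIdToPoints in the snippet above: a lookup from cluster id to the points assigned to that cluster.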
[1] http://grepcode.com/file/repo1.maven.org/maven2/org.apache.mahout/mahout-utils/0.2/org/apache/mahout/utils/clustering/ClusterDumper.java
[2] https://cwiki.apache.org/MAHOUT/cluster-dumper.html
--
Ioan Eugen Stan
Big Data Solutions Development Romania
1&1 Internet Development SRL
Str Mircea Eliade 18
Sect 1, Bucuresti
012015, Romania
Tel: +40 312 23-9254
EMail: [email protected]