Hi,
I have a piece of code that creates a few clusters of vectors.
When I run it, the log shows that 2 clusters have been created,
each with its own center point:
INFO CanopyDriver - Build Clusters Input: C:/Users/xxxxxxx/Documents/jboss-as-7.1.1.Final/jboss-as-7.1.1.Final/bin/BI/synthetic_control.seq Out: C:/Users/xxxxxxx/Documents/jboss-as-7.1.1.Final/jboss-as-7.1.1.Final/bin/BI/output Measure: org.apache.mahout.common.distance.EuclideanDistanceMeasure@7faf9b87 t1: 5.0 t2: 9.0
DEBUG CanopyClusterer - Created new Canopy:0 at center:[0.100, 1.000]
DEBUG CanopyClusterer - Added point: [0.100, 0.900] to canopy: C-0
DEBUG CanopyClusterer - Added point: [0.100, 0.950] to canopy: C-0
DEBUG CanopyClusterer - Created new Canopy:1 at center:[12.300, 12.400]
DEBUG CanopyClusterer - Added point: [12.700, 12.900] to canopy: C-1
DEBUG CanopyDriver - Writing Canopy:C-0 center:[0.100, 0.950] numPoints:3 radius:[1:0.041]
DEBUG CanopyDriver - Writing Canopy:C-1 center:[12.500, 12.650] numPoints:2 radius:[0.200, 0.250]
I wrote a method that reads the output and prints the 2 clusters with their center points:
private final static String partMDir = outputDir + "\\"
        + "clusters-0-final" + "\\part-r-00000";

public void printClusters() {
    SequenceFile.Reader readerSequence;
    try {
        readerSequence = new SequenceFile.Reader(fs, new Path(partMDir), conf);
        // Key: cluster name (e.g. C-0); value: the cluster written by the driver
        Text clusterName = new Text();
        ClusterWritable centerVector = new ClusterWritable();
        while (readerSequence.next(clusterName, centerVector)) {
            System.out.println(centerVector.getValue()
                    + " is a center of " + clusterName);
        }
        readerSequence.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
The result:
C-0: {0:0.10000000000000002,1:0.9499999999999998} is a center of C-0
C-1: {0:12.5,1:12.65} is a center of C-1
I would also like to list all the elements of each cluster. I checked a few
methods of the Text class, but I did not find anything suitable.
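
To illustrate the output being asked for, here is a minimal plain-Java sketch that groups the five points from the log into canopies and prints the members of each one. It uses no Mahout classes. The class CanopyMembersSketch and its methods are hypothetical names made up for this example, and the assignment rule (add a point to every canopy closer than T1, start a new canopy when no canopy is closer than T2, measure against the canopy's first point) is an assumption inferred from the DEBUG lines above, not a copy of Mahout's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not part of Mahout.
public class CanopyMembersSketch {

    // The five points that appear in the log, in the order they were logged.
    static List<double[]> samplePoints() {
        return Arrays.asList(
                new double[] {0.1, 1.0},
                new double[] {0.1, 0.9},
                new double[] {0.1, 0.95},
                new double[] {12.3, 12.4},
                new double[] {12.7, 12.9});
    }

    // Euclidean distance between two points of equal dimension.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Assumed rule: a point joins every canopy whose founding point is
    // closer than t1; if no canopy is closer than t2, it founds a new one.
    static Map<String, List<double[]>> assign(List<double[]> points,
                                              double t1, double t2) {
        List<double[]> founders = new ArrayList<>();
        Map<String, List<double[]>> members = new LinkedHashMap<>();
        for (double[] p : points) {
            boolean stronglyBound = false;
            for (int i = 0; i < founders.size(); i++) {
                double dist = distance(founders.get(i), p);
                if (dist < t1) {
                    members.get("C-" + i).add(p);
                }
                stronglyBound |= dist < t2;
            }
            if (!stronglyBound) {
                String name = "C-" + founders.size();
                founders.add(p);
                members.put(name, new ArrayList<>(Arrays.asList(p)));
            }
        }
        return members;
    }

    // Formats a point like the log does, e.g. [0.100, 0.950].
    static String format(double[] p) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < p.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(String.format(Locale.ROOT, "%.3f", p[i]));
        }
        return sb.append("]").toString();
    }

    // One line per canopy: its name followed by all of its members.
    static List<String> describe(Map<String, List<double[]>> members) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, List<double[]>> e : members.entrySet()) {
            StringBuilder sb = new StringBuilder(e.getKey()).append(":");
            for (double[] p : e.getValue()) {
                sb.append(' ').append(format(p));
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // t1 = 5.0 and t2 = 9.0 are the values reported in the log.
        for (String line : describe(assign(samplePoints(), 5.0, 9.0))) {
            System.out.println(line);
        }
    }
}
```

With the points and thresholds from the log, this prints "C-0: [0.100, 1.000] [0.100, 0.900] [0.100, 0.950]" and "C-1: [12.300, 12.400] [12.700, 12.900]", which is the kind of per-cluster listing wanted from the real Mahout output.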
Thank you in advance