Do you want a hierarchy without overlap? If yes, then you should not use FuzzyK, which can assign a document to more than one cluster, or you should use it only at the lowest level.
Clustering Algorithm on parent data -> ClusterOutputPostProcessorDriver -> Clustering Algorithm on sub-clusters -> ClusterOutputPostProcessorDriver -> Clustering Algorithm on sub-clusters ...
That's how it is supposed to work: each level's output is post-processed and then clustered again. You can use different clustering algorithms at different levels.
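To make the chain above concrete, here is a rough sketch in plain Python. It is not Mahout code and calls no Mahout API: `split_at_largest_gap` is a made-up toy clusterer on 1-D points, and `build_tree` is a hypothetical helper whose regrouping step only stands in for what ClusterOutputPostProcessorDriver does between levels. The point is the shape of the recursion: cluster, regroup the points per cluster, then cluster each group again, possibly with a different algorithm per level.

```python
from typing import Callable, Dict, List

Points = List[float]
# A clusterer maps a list of points to {cluster_id: points in that cluster}.
Clusterer = Callable[[Points], Dict[int, Points]]


def split_at_largest_gap(points: Points) -> Dict[int, Points]:
    """Toy hard clusterer (illustration only): sort and cut at the widest gap."""
    ordered = sorted(points)
    if len(ordered) < 2:
        return {0: ordered}
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return {0: ordered[:cut], 1: ordered[cut:]}


def build_tree(points: Points, clusterers: List[Clusterer]) -> dict:
    """Apply clusterers[0] to the parent data, then recurse into each group
    with the remaining clusterers (one clusterer per level)."""
    node = {"points": sorted(points), "children": []}
    if not clusterers or len(points) < 2:
        return node
    # Stand-in for the post-processing step: one group of points per cluster.
    groups = clusterers[0](points)
    for cluster_id in sorted(groups):
        node["children"].append(build_tree(groups[cluster_id], clusterers[1:]))
    return node


# Two levels; a different clusterer could be passed for each level.
tree = build_tree(
    [1.0, 1.2, 1.9, 10.0, 10.1, 11.5],
    [split_at_largest_gap, split_at_largest_gap],
)
```

Because every level here uses a hard clusterer, each point ends up in exactly one branch, so the result is a proper tree. With an overlapping clusterer at an upper level, the same point would be carried down several branches.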
On 20-04-2012 17:06, ivan obeso wrote:
I have clustered a few documents using fuzzy clustering, so now I have a large number of clusters with the IDs of the documents contained in each one. In order to give some structure to this data, I have used the ClusterOutputPostProcessorDriver class to discover the overlaps between clusters, but it's a low-level solution: if cluster A overlaps with cluster B, then cluster B overlaps with A too, so it's a little difficult to establish a hierarchy. I would like to know if there's a simple way to obtain the cluster tree of this data. I am using Mahout v0.5. Sorry for my English. [http://www.cs.nyu.edu/~davise/om-dist/node4.html]
