Both H1 and H2 seek to balance inter-cluster and intra-cluster similarity in arriving at their evaluation of a cluster solution. For both criterion functions the goal is to maximize their value.
H1 = I1/E1
H2 = I2/E1

So, both rely on E1 to measure inter-cluster similarity. E1 is minimized, and it is based on the cosine of the angle between each cluster centroid and the overall centroid, so minimizing it results in the greatest angles between the centroids. Because E1 sits in the denominator, a smaller E1 gives a larger H1 or H2.

H1 and H2 differ in how they measure intra-cluster similarity - H1 relies on I1 (the ball of string) while H2 relies on I2 (the flower). Put another way, H1 seeks to maximize the pairwise similarities between the contexts in each cluster (I1) while minimizing the cosine between the centroids of the clusters and the overall collection centroid (E1). H2 seeks to maximize the similarities between the centroid of each cluster and the contexts therein (I2), while minimizing the cosine between the centroids of the clusters and the overall collection centroid (E1).

In both cases (H1 and H2) the goal is quite simple - maximize intra-cluster similarity while maximizing inter-cluster differences. In other words, find tight clusters that are far apart from each other.

In many respects it seems like H2 might be a very good candidate for use as a criterion function in general. Since it relies on centroid computations only and does not do exhaustive pairwise comparisons, it is a bit more efficient than H1, and in principle it seems to make some sense. So perhaps in addition to I2 it would make sense to try H2 from time to time in experiments. H1 and I1 are also interesting, although I think both have a bias towards finding clusters of the same size, which might not be exactly what we want.
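To make the relationship between these functions concrete, here is a minimal Python sketch that scores a given cluster solution with I1, I2, E1, H1 and H2. It is only an illustration under stated assumptions, not the code that CLUTO or SenseClusters actually runs: the function name criterion_values is hypothetical, contexts are assumed to be non-zero row vectors that get normalized to unit length, the pairwise sum in I1 includes each context paired with itself and is divided by the cluster size, and each cluster's cosine in E1 is weighted by the cluster size. Those weightings follow my understanding of the CLUTO-style definitions and are not spelled out above.

```python
import numpy as np

def criterion_values(X, labels):
    """Score a cluster solution; returns a dict with I1, I2, E1, H1, H2.

    Hypothetical helper for illustration. X holds one non-zero context
    vector per row; labels holds one cluster id per row.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Unit-length rows, so a dot product between two rows is their cosine.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)

    # Direction of the overall collection centroid.
    total = X.sum(axis=0)
    total_unit = total / np.linalg.norm(total)

    i1 = i2 = e1 = 0.0
    for c in np.unique(labels):
        members = X[labels == c]
        n_r = members.shape[0]

        # Direction of this cluster's centroid.
        centroid = members.sum(axis=0)
        centroid_unit = centroid / np.linalg.norm(centroid)

        # I1: exhaustive pairwise similarities inside the cluster,
        # divided by cluster size (assumed weighting).
        i1 += (members @ members.T).sum() / n_r

        # I2: similarity of each context to its own cluster centroid.
        i2 += (members @ centroid_unit).sum()

        # E1: cosine between cluster centroid and overall centroid,
        # weighted by cluster size (assumed weighting).
        e1 += n_r * float(centroid_unit @ total_unit)

    return {"I1": i1, "I2": i2, "E1": e1, "H1": i1 / e1, "H2": i2 / e1}
```

Calling criterion_values on the same contexts with two different label assignments and comparing the H1 or H2 values shows which solution each function prefers; the larger value wins. Note that I2 and E1 only touch centroids, while I1 as written here builds the full pairwise similarity matrix for each cluster, which is the efficiency difference mentioned above.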
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
