As part of evaluating cluster quality, I'd like to implement several quality measures, especially external ones, which compare a clustering against a reference partition.
The one that I think would be particularly useful is the Adjusted Rand Index (ARI) [1]. Using a contingency table built from the partitions of two clusterings, it returns a value of at most 1 that measures the similarity of the partitions: 1 means they are identical, values near 0 mean agreement no better than chance, and negative values are possible when agreement is worse than chance. (It is the unadjusted Rand index that is bounded between 0 and 1.)

First, I'd like to know your thoughts on using ARI as a metric. Second, the existing ConfusionMatrix implementation is NxN. I'd like to extend this class to support unlabeled partitions of different sizes and add a method that computes the ARI. What do you think?

[1] http://en.wikipedia.org/wiki/Rand_index
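
To make the proposal concrete, here is a minimal Python sketch of how the ARI could be computed from the contingency table of two labelings. It is only an illustration, not the existing ConfusionMatrix API: the function name adjusted_rand_index and its arguments are hypothetical, and the choice to return 1.0 for degenerate inputs is an assumption. The two partitions may use different label sets and have different numbers of clusters, which is the case an NxN matrix does not cover.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Hypothetical helper: ARI of two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # assumption: fewer than two items counts as perfect agreement

    # Sparse contingency table: cell (a, b) counts items with label a and label b.
    # The table is rectangular in general, since the label sets may differ in size.
    cells = Counter(zip(labels_a, labels_b))
    row_sums = Counter(labels_a)
    col_sums = Counter(labels_b)

    # Pairs of items placed together in both, in the first, and in the second partition.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in row_sums.values())
    sum_cols = sum(comb(c, 2) for c in col_sums.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2           # largest attainable agreement

    if maximum == expected:
        # Assumption: both partitions are trivial (one cluster, or all singletons).
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


if __name__ == "__main__":
    print(adjusted_rand_index([0, 0, 1, 1], ["x", "x", "y", "y"]))  # same partition, different names
    print(adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2]))          # 2 clusters vs 3 clusters
    print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))          # worse than chance, negative
```

The last call shows why the value is not confined to the range 0 to 1: two partitions that disagree on every pair score below zero.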
