Does nobody have any thoughts on this? Ted, please? :)
On Fri, Apr 5, 2013 at 2:47 PM, Dan Filimon <[email protected]> wrote:

> As part of evaluating cluster quality, I'd like to implement a bunch of
> quality measures, especially external ones.
>
> The one that I think would be particularly useful is the Adjusted Rand
> Index [1].
> Using a contingency table with the partitions from 2 clusterings, this
> returns a value of at most 1 (higher being better) corresponding to the
> similarity of the partitions. A value near 0 is what random partitions
> would give, and it can go slightly negative.
>
> First of all, I'd like to know your thoughts on using ARI as a metric.
>
> Second, there's an implementation of ConfusionMatrix that is NxN. I'd like
> to extend this class to support unlabeled partitions of different sizes and
> add a method that computes the ARI.
>
> What are your thoughts?
>
> [1] http://en.wikipedia.org/wiki/Rand_index
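
To make the proposal concrete, here is a rough sketch of the ARI computation from a contingency table of two partitions. It is plain Python for illustration only, not Mahout code: the function name adjusted_rand_index and the list-of-labels input are assumptions of mine, and it does not touch the existing ConfusionMatrix class. The table is kept sparse, so the two clusterings can have different numbers of clusters and arbitrary labels.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index of two partitions of the same points.

    labels_a[i] and labels_b[i] are the cluster ids of point i in each
    clustering. The ids are arbitrary and need not match across the two.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must cover the same points")

    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        # Fewer than two points: the partitions trivially agree.
        return 1.0

    # Sparse contingency table: (cluster in A, cluster in B) -> point count.
    contingency = Counter(zip(labels_a, labels_b))
    row_sums = Counter(labels_a)  # cluster sizes in the first clustering
    col_sums = Counter(labels_b)  # cluster sizes in the second clustering

    # Number of point pairs that fall together in each cell, row, column.
    sum_cells = sum(comb(c, 2) for c in contingency.values())
    sum_rows = sum(comb(c, 2) for c in row_sums.values())
    sum_cols = sum(comb(c, 2) for c in col_sums.values())

    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2

    if max_index == expected:
        # Degenerate case, e.g. both clusterings put everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

For example, adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]) gives 1.0 because the partitions are identical up to relabeling, while adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]) gives -0.5, which shows the index dropping below 0.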
