For many purposes, the best measure of clustering quality is the average distance from *unseen* data points to their nearest cluster.
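
A minimal sketch of that measure, assuming Euclidean distance and clusters summarized by their centroids (both are assumptions, as are the function name and the sample data; this is plain Python, not Mahout code). Lower scores mean the held-out points sit closer to the clusters:

```python
import math


def mean_nearest_centroid_distance(points, centroids):
    """Average Euclidean distance from each point to its closest centroid."""
    if not points or not centroids:
        raise ValueError("need at least one point and one centroid")
    total = 0.0
    for p in points:
        # Distance from this point to the nearest centroid only.
        total += min(math.dist(p, c) for c in centroids)
    return total / len(points)


# Hypothetical centroids from an earlier clustering run on training data.
centroids = [(0.0, 0.0), (10.0, 10.0)]
# Hypothetical held-out points that were not used to fit the clusters.
held_out = [(3.0, 4.0), (10.0, 10.0), (10.0, 15.0)]

score = mean_nearest_centroid_distance(held_out, centroids)
print(score)
```

Computing the same score for several values of k on the same held-out set gives one way to compare clusterings, with the caveat that the score tends to shrink as k grows.
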
There hasn't been a lot of work on this, but it should be easy to retrofit the new data distance metric into a classifier.

On Tue, May 15, 2012 at 3:46 PM, Pat Ferrel <[email protected]> wrote:

> So many questions about best k, how to choose t1 and t2, how much help is
> dimensional reduction would have clear answers if we had a way to judge the
> quality of clusters.
>
> Various methods were discussed here for a time:
> http://www.lucidimagination.com/search/document/dab8c1f3c3addcfe/validating_clustering_output
>
> Has there been any work on building a measure of quality?
