These all depend on gold standards.  If you have those, then it is easy to
evaluate a clustering.

What is hard is to evaluate a clustering without a standard.  I have done
this, somewhat, in the past by looking at stability over time in cluster
size and membership.  I have also looked at how useful cluster membership
is for predicting objective attributes that were not used in the clustering.
The stability criterion might apply to some of our data sets; the utility
measure only works in a modeling setting.
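The stability check can be sketched with the plain Rand index: score two clusterings of the same items by the fraction of item pairs on which they agree (the week1/week2 assignments below are made-up illustrations, not real data):

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of item pairs the two clusterings agree on:
    either both put the pair together, or both keep it apart."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Hypothetical assignments for the same five items at two times.
week1 = [0, 0, 1, 1, 2]
week2 = [0, 0, 1, 2, 2]
print(rand_index(week1, week2))  # → 0.8
```

A clustering that is stable over time scores near 1.0 against its earlier self; wild swings in membership push the score toward chance level.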

On Tue, Aug 18, 2009 at 7:32 AM, Grant Ingersoll <[email protected]> wrote:

> Also found:
> http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
>
>
> On Aug 18, 2009, at 9:55 AM, Grant Ingersoll wrote:
>
>
>> On Jul 27, 2009, at 9:42 PM, Ted Dunning wrote:
>>
>>> The other reference I am looking for may be in David MacKay's book.  The
>>> idea is that you measure the quality of the approximation by looking at
>>> the entropy in the cluster assignment together with the residual required
>>> to precisely specify the original data relative to the quantized value.
>>>
>>
>> Is the W. M. Rand paper in JSTOR ("Objective Criteria for the Evaluation
>> of Clustering Methods") worthwhile on this topic?  Basic searches for
>> "evaluating clustering" or "cluster evaluation" on Google Scholar turn up
>> very little.  The Rand paper is from 1971, but who knows...
>>
>> Of course, I'd like something that doesn't require purchase (sigh.)
>>
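The two-part-code idea quoted above can be sketched for 1-D data: total bits are the bits needed to name each point's cluster plus the bits needed to encode its residual around the assigned centroid.  This is a minimal sketch, not the book's formulation; the data, centroids, and sigma are made up, and the residual term uses differential entropy, so the score is only meaningful for comparing clusterings:

```python
import math
from collections import Counter

def description_length(points, assignments, centroids, sigma=1.0):
    """Two-part code length in bits: name each point's cluster at its
    empirical frequency, then encode the point's residual under a
    Gaussian of width sigma centered on the assigned centroid."""
    n = len(points)
    counts = Counter(assignments)
    # Bits to specify each point's cluster index.
    index_bits = sum(-math.log2(counts[c] / n) for c in assignments)
    # Bits for each point's residual relative to its centroid:
    # -log2 of the Gaussian density at the residual.
    residual_bits = sum(
        0.5 * math.log2(2 * math.pi * sigma ** 2)
        + (x - centroids[c]) ** 2 / (2 * sigma ** 2 * math.log(2))
        for x, c in zip(points, assignments)
    )
    return index_bits + residual_bits

# Toy data: two tight groups.  A clustering that matches the structure
# pays a little for cluster indices but very little for residuals.
points = [0.0, 0.1, 5.0, 5.1]
two_clusters = description_length(points, [0, 0, 1, 1], {0: 0.05, 1: 5.05})
one_cluster = description_length(points, [0, 0, 0, 0], {0: 2.55})
print(two_clusters < one_cluster)  # the tight clustering costs fewer bits
```

The better clustering is the one that minimizes this total, which trades off assignment entropy against residual size exactly as described in the quote.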


-- 
Ted Dunning, CTO
DeepDyve
