I think the total distance from each document to its nearest cluster is an
interesting convergence measure as well.  It should level off at some
asymptote as the clustering proceeds.
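
A rough sketch of that measure, in plain Python rather than Mahout code: sum,
over all documents, the distance to the closest centroid, and record the sum
once per iteration. The function names, the Euclidean distance, and the toy
Lloyd-style loop are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def total_distance_to_nearest(points, centroids, distance=euclidean):
    """Sum over all points of the distance to the closest centroid."""
    return sum(min(distance(p, c) for c in centroids) for p in points)


def kmeans_with_trace(points, centroids, max_iter=20):
    """Toy k-means loop that records the total distance at each iteration."""
    trace = []
    for _ in range(max_iter):
        # Assignment step: group each point with its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: euclidean(p, centroids[i]))
            groups[nearest].append(p)

        # Record the measure for the centroids used in this iteration.
        trace.append(total_distance_to_nearest(points, centroids))

        # Update step: mean of each group; an empty group keeps its centroid.
        new_centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else list(c)
            for g, c in zip(groups, centroids)
        ]
        if new_centroids == centroids:
            break  # nothing moved, so the trace has flattened out
        centroids = new_centroids
    return centroids, trace
```

Printing the returned trace after a run shows where the curve flattens. Note
that k-means minimizes squared distance, so the plain sum of distances is a
diagnostic and is not guaranteed to decrease at every single step.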

On Sat, Jan 2, 2010 at 6:34 PM, Drew Farris <[email protected]> wrote:

> On Sat, Jan 2, 2010 at 8:31 PM, Bogdan Vatkov <[email protected]> wrote:
>
> > Still, is there a way to print out the current convergence after each
> > iteration or something?
> >
>
> Each cluster has its own convergence, which is defined as the distance
> between its center and its centroid. As a result, overall convergence is a
> binary measure: whether every cluster has converged, that is, whether each
> cluster's convergence is less than or equal to the convergence delta.
>
> If you are interested in the convergence measure for each cluster, you will
> need to modify computeConvergence() in o.a.m.clustering.kmeans.Cluster to
> either store or log the convergence.
>
> If there's sufficient interest in this, I can prep a patch that will allow
> convergence to be stored and dumped via ClusterDumper.
>
> Drew
>
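
For anyone who wants to see the per-cluster test Drew describes spelled out,
here is a minimal sketch in plain Python. It is not the Mahout code. It
assumes "center" is the cluster's position at the start of the iteration and
"centroid" is the mean of the points assigned during it, and it assumes
Euclidean distance. The function names are hypothetical.

```python
import math


def cluster_convergence(center, centroid):
    """Distance between a cluster's old center and its new centroid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(center, centroid)))


def convergence_report(centers, centroids, delta):
    """Return per-cluster convergence values and the overall binary result.

    The overall result is True only if every cluster's convergence value
    is less than or equal to delta.
    """
    per_cluster = [cluster_convergence(c, m)
                   for c, m in zip(centers, centroids)]
    return per_cluster, all(d <= delta for d in per_cluster)
```

Logging the per-cluster list once per iteration gives the numeric view Bogdan
asked about, while the boolean matches the all-or-nothing test described above.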



-- 
Ted Dunning, CTO
DeepDyve
