Also, if cluster training starts from the posterior of a previous training 
session over the corpus, with new data added since that training began, the 
prior clusters should already be very close to an optimal solution for the new 
data, so the number of iterations required to converge on a new posterior 
should be reduced. I haven't tried this in practice, but it seems logical. 
Convergence is measured by how much each cluster has changed during an iteration.
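
To make the idea concrete, here is a minimal sketch in plain Python. It is not the code of any particular library, and the names (kmeans, demo, tol, max_iter) are hypothetical. It assumes plain k-means with Euclidean distance, where the centroids from the previous run stand in for the "posterior", and it treats the run as converged once no centroid moves more than tol in an iteration.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, initial_centroids, tol=1e-4, max_iter=100):
    """Plain Lloyd's k-means. Returns (centroids, iterations_used)."""
    centroids = [list(c) for c in initial_centroids]
    dim = len(centroids[0])
    for iteration in range(1, max_iter + 1):
        sums = [[0.0] * dim for _ in centroids]
        counts = [0] * len(centroids)
        for p in points:
            # Assign the point to its nearest centroid.
            j = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            counts[j] += 1
            for d in range(dim):
                sums[j][d] += p[d]
        # Recompute each centroid; an empty cluster keeps its old position.
        updated = [
            [s / counts[j] for s in sums[j]] if counts[j] else centroids[j]
            for j in range(len(centroids))
        ]
        # Convergence: the largest distance any centroid moved this iteration.
        shift = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:
            return centroids, iteration
    return centroids, max_iter


def demo():
    """Compare a cold start with a warm start on synthetic data (assumed)."""
    rng = random.Random(0)

    def blob(cx, cy, n):
        return [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)) for _ in range(n)]

    old_data = blob(0, 0, 200) + blob(6, 6, 200)
    new_data = blob(0, 0, 20) + blob(6, 6, 20)
    seeds = rng.sample(old_data, 2)  # arbitrary cold-start seeds

    previous, _ = kmeans(old_data, seeds)     # the earlier training session
    corpus = old_data + new_data              # corpus with the new data added
    _, cold_iters = kmeans(corpus, seeds)     # start over from the raw seeds
    _, warm_iters = kmeans(corpus, previous)  # start from the previous result
    print("cold start iterations:", cold_iters)
    print("warm start iterations:", warm_iters)


if __name__ == "__main__":
    demo()
```

Running demo() prints the iteration counts for both starts on the enlarged corpus. How much the warm start saves will depend on how far the new data moves the clusters, so treat the numbers as illustrative only.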

-----Original Message-----
From: Benson Margulies [mailto:[email protected]] 
Sent: Thursday, May 12, 2011 9:14 AM
To: [email protected]
Subject: Re: AW: Incremental clustering

Is the idea here that you are going to be presented with many
different corpora that have some sort of overall resemblance, so that
priors derived from the first N speed up clustering N+1?

--benson
