Re: [jira] Commented: (MAHOUT-137) Convert Clustering Algs to use Vector Writable

Grant Ingersoll Tue, 23 Jun 2009 08:41:43 -0700


On Jun 23, 2009, at 11:18 AM, Jeff Eastman wrote:

That makes sense, though I don't understand why the reducer is not doing its job in the test you cite. I've had to do manual things (like calling close()) in the unit tests to get all of the functionality to exercise.

All of the clustering algorithms behave similarly: each cluster has a center (prior) which is used to observe some of the data (observations) based upon a distance function (pdf), which is used to compute its new centroid (posterior). I think it is possible to abstract them into a common framework using this model.

It makes sense b/c the M/R pieces rely on the fact that everything round trips through the serialization/deserialization phase, whereas that particular test does not do that. The centroid from one iteration thus becomes the center for the next iteration, AFAICT.
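
A minimal Java sketch of the center/observe/centroid model discussed above, for illustration only. The names (ClusterSketch, observe, computeCentroid, nextIteration, iterate) are hypothetical and are not Mahout's API, and squared Euclidean distance is an assumption standing in for whatever distance function or pdf a given algorithm uses. It runs in memory and does not model the serialization round trip at all.

```java
import java.util.List;

// Hypothetical illustration of the prior -> observations -> posterior model.
// Not Mahout code; all names here are assumptions.
class ClusterSketch {
    double[] center;       // prior: the center used to claim points this iteration
    private double[] sum;  // running sum of the points observed this iteration
    private int count;     // number of points observed this iteration

    ClusterSketch(double[] center) {
        this.center = center.clone();
        this.sum = new double[center.length];
    }

    // Squared Euclidean distance from the current center (assumed metric).
    double distance(double[] point) {
        double d = 0.0;
        for (int i = 0; i < point.length; i++) {
            double diff = point[i] - center[i];
            d += diff * diff;
        }
        return d;
    }

    // Record one observation assigned to this cluster.
    void observe(double[] point) {
        for (int i = 0; i < point.length; i++) {
            sum[i] += point[i];
        }
        count++;
    }

    // Posterior: the mean of the observed points, or the old center if none.
    double[] computeCentroid() {
        if (count == 0) {
            return center.clone();
        }
        double[] centroid = new double[sum.length];
        for (int i = 0; i < sum.length; i++) {
            centroid[i] = sum[i] / count;
        }
        return centroid;
    }

    // The centroid becomes the center for the next iteration; observations reset.
    void nextIteration() {
        center = computeCentroid();
        sum = new double[center.length];
        count = 0;
    }

    // One full iteration: assign each point to its nearest center, then update.
    static void iterate(List<ClusterSketch> clusters, List<double[]> points) {
        for (double[] p : points) {
            ClusterSketch nearest = clusters.get(0);
            for (ClusterSketch c : clusters) {
                if (c.distance(p) < nearest.distance(p)) {
                    nearest = c;
                }
            }
            nearest.observe(p);
        }
        for (ClusterSketch c : clusters) {
            c.nextIteration();
        }
    }
}
```

In this sketch the hand-off from centroid to center is an explicit nextIteration() call, so a test that never makes the equivalent step would keep reusing the old center.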
