Hi Jeff,

This is a good paper with a simple measure of cluster quality based on intra-cluster density and inter-cluster separation. It's pretty easy to compute. We need to make it a map/reduce job.

http://docs.google.com/viewer?a=v&q=cache:z5p9n04cBQEJ:www.db-net.aueb.gr/index.php/corporate/content/download/227/833/file/HV_poster2002.pdf+clustering+quality&hl=en&gl=in&pid=bl&srcid=ADGEESiC-ocW6IWrKR4cb1t1ZqkzRKQ3tDv4UFBkVaUKU0gG3kADcPWIjs-60A0912nu8MFPsVM3pf9jKrP98dL-B-BaiOC9LObBS3VkJK6Mu6josZtVegLxp3BftduD3hFxtGOVZK_b&sig=AHIEtbSZwtgw9wmJoojQn7Dlz5OL67vICw

Robin
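
To make the density/separation idea concrete, here is a rough single-machine sketch. It is not the formula from the linked paper: it assumes, purely for illustration, that "density" is the mean Euclidean distance of points to their own cluster centroid and "separation" is the smallest distance between any two centroids. The function names (centroid, euclidean, cluster_quality) are hypothetical and do not come from Mahout.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    dims = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dims)]


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_quality(clusters):
    """Return (intra, inter, ratio) for a list of clusters.

    Assumed definitions (illustrative, not the paper's):
      intra = mean distance of every point to its own cluster centroid
      inter = minimum distance between any two cluster centroids
      ratio = inter / intra (larger suggests tighter, better separated clusters)
    """
    if len(clusters) < 2:
        raise ValueError("need at least two clusters to measure separation")

    centers = [centroid(c) for c in clusters]

    # Per-point distances are summed, so this part could be split across
    # workers and combined later (partial sums and counts are associative).
    total = 0.0
    count = 0
    for points, center in zip(clusters, centers):
        for p in points:
            total += euclidean(p, center)
            count += 1
    intra = total / count

    k = len(centers)
    inter = min(
        euclidean(centers[i], centers[j])
        for i in range(k)
        for j in range(i + 1, k)
    )

    ratio = inter / intra if intra > 0 else math.inf
    return intra, inter, ratio


if __name__ == "__main__":
    demo = [
        [[0.0, 0.0], [0.0, 2.0]],
        [[10.0, 0.0], [10.0, 2.0]],
    ]
    print(cluster_quality(demo))
```

Because the intra-cluster part only needs a sum of distances and a point count per cluster, it looks like a natural fit for a map step that emits (cluster id, (distance, 1)) pairs and a reduce step that adds them up. That mapping is an assumption about how the job might be structured, not a description of existing code.
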
On Wed, Apr 7, 2010 at 7:03 AM, Jeff Eastman <j...@windwardsolutions.com> wrote:

> Hi Robin,
>
> Great! I've got the refactoring changes for consolidating all the various
> cluster types under a Cluster interface (formerly Printable but now with id,
> numPoints and a center added). Dirichlet models still don't yet have
> meaningful ids implemented but they all do (so far anyway) have a notion of
> "numPoints" and a "center". I'm working on tests tomorrow to make sure the
> ClusterDumper actually works with Dirichlet clusters then I will commit
> that. Wednesday or Thursday most likely.
>
> BTW, I changed my mind about foisting off the old Printable interface on
> Vectors (but am still open to the idea if somebody actually working in math
> thinks it is worth doing). All the new Clusters use the vector formatting
> done in ClusterBase.
>
> What I'd really like is feedback from ClusterDumper users on what is
> working and what is needed to address MAHOUT-236. That includes you, right?
>
> Jeff
>
> PS: Ted, you expressed some doubts about the value of consolidating
> Dirichlet clusters with the others. So far it seems to be a reasonable fit
> but I'm doing the engineering on a tiny subset of simple models without
> enough theoretical insight to see any pitfalls ahead. Is there a
> "DistanceMeasure-like" discussion that might provide a firmer underpinning
> for this work?
>
> Robin Anil wrote:
>
>> No one yet. I am willing to help In case you need an extra pair of hands on
>> this one.
>>
>> Robin