See https://issues.apache.org/jira/browse/MAHOUT-761

On Jul 13, 2011, at 4:31 PM, Jeff Eastman wrote:

> Well, distance is dependent upon the distance measure you want to use. A
> post-processing step could easily calculate this. The ClusterEvaluator may
> have some methods that could be useful. It calculates a set of representative
> points for each cluster and calculates interCluster and intraCluster
> densities from that.
>
> -----Original Message-----
> From: Grant Ingersoll [mailto:[email protected]]
> Sent: Wednesday, July 13, 2011 1:28 PM
> To: [email protected]
> Subject: Re: Emitting distance from centroid for K-Means
>
> Good to know. Next question, what's the preferred way, then, to get out
> either the distance or what Ted said?
>
> -Grant
>
> On Jul 13, 2011, at 4:25 PM, Ted Dunning wrote:
>
>> I take back what I said.
>>
>> Jeff is correct.
>>
>> On Wed, Jul 13, 2011 at 1:23 PM, Jeff Eastman <[email protected]> wrote:
>>
>>> The weight is the probability the vector is a member of the cluster. For
>>> FuzzyK and Dirichlet it is fractional, for KMeans it is 1 as the algorithm
>>> is maximum likelihood and each point is only assigned to a single cluster.
>>>
>>> -----Original Message-----
>>> From: Grant Ingersoll [mailto:[email protected]]
>>> Sent: Wednesday, July 13, 2011 1:11 PM
>>> To: [email protected]
>>> Subject: Emitting distance from centroid for K-Means
>>>
>>> Does it make sense to output the distance to the cluster as the weight in
>>> the KMeansClusterer.outputPointWithClusterInfo method instead of 1? What's
>>> the purpose of the 1 as the weight?
>>>
>>> -Grant
>
> --------------------------
> Grant Ingersoll

--------------------------
Grant Ingersoll
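
For reference, the post-processing step Jeff mentions could look roughly like the plain-Java sketch below, which computes each point's distance to the centroid of the cluster it was assigned to. It deliberately uses no Mahout classes: the class name CentroidDistance, its methods, and the array-based inputs are all hypothetical, and Euclidean distance is only an assumed choice, since the right measure depends on what was used for clustering.

```java
// Hypothetical helper, not part of Mahout: distance from each point to its
// assigned centroid, computed after clustering has finished.
final class CentroidDistance {

    // Euclidean distance between two vectors of equal length (assumed measure).
    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // assignments[i] is the index into centroids of the cluster that
    // points[i] was assigned to (one cluster per point, as in k-means).
    static double[] distancesToAssignedCentroid(double[][] points,
                                                int[] assignments,
                                                double[][] centroids) {
        if (points.length != assignments.length) {
            throw new IllegalArgumentException("one assignment per point is required");
        }
        double[] distances = new double[points.length];
        for (int i = 0; i < points.length; i++) {
            distances[i] = euclidean(points[i], centroids[assignments[i]]);
        }
        return distances;
    }

    public static void main(String[] args) {
        double[][] centroids = {{0.0, 0.0}, {10.0, 10.0}};
        double[][] points = {{3.0, 4.0}, {10.0, 12.0}};
        int[] assignments = {0, 1};
        double[] distances = distancesToAssignedCentroid(points, assignments, centroids);
        for (int i = 0; i < distances.length; i++) {
            System.out.println("point " + i + " -> cluster " + assignments[i]
                    + ", distance " + distances[i]);
        }
    }
}
```

Keeping this separate from the clustering output leaves the weight as the membership probability (1 for KMeans) and lets the distance measure be swapped without rerunning the clustering.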
