I am not 100% sure how Mahout's implementation of the KMeans algorithm does this, but in general the cluster center is the centroid of all the points that belong to that cluster. In the simplest case, it is just the average (mean) of those points. Alternatively, it could be the actual data point that lies closest to the centroid.
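To make the two options concrete, here is a minimal sketch in plain Python. It is not Mahout code: the function names `centroid` and `closest_point` and the sample vectors are made up for illustration, and Euclidean distance is an assumption (the distance measure may be different in your setup).

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length numeric vectors."""
    n = len(points)
    dims = len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dims)]

def closest_point(points, center):
    """The actual data point with the smallest Euclidean distance to center."""
    return min(points, key=lambda p: math.dist(p, center))

# Hypothetical cluster members (illustrative values, not Mahout output).
cluster = [[1.0, 0.0], [0.0, 2.0], [2.0, 4.0]]

center = centroid(cluster)                # mean of the points: [1.0, 2.0]
nearest = closest_point(cluster, center)  # member closest to it: [0.0, 2.0]
```

Note that the mean is usually not one of the original points, while the closest-point variant always returns an actual member of the cluster.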
On Thu, Aug 22, 2013 at 6:58 AM, Grant Ingersoll <[email protected]> wrote:

> On Aug 12, 2013, at 5:12 PM, William Moran <[email protected]> wrote:
>
> > Hi,
> >
> > What exactly are the numbers next to these terms? (this is an example
> > clusterdump from the Mahout in Action book, but my clusterdumps look
> > similar).
>
> They are the weights assigned to each of the terms. They are likely the
> TF/IDF values, but I believe they may be other things depending on how your
> dictionary/vectors were created.
>
> > Top Terms:
> >
> > Shania Twain => 1.126984126984127
> > Garth Brooks => 0.746031746031746
> > Sara Evans => 0.6031746031746031
> > Lonestar => 0.5238095238095238
> >
> > Sorry if this is an obvious question but I find it hard to find details on
> > these specifics.
> >
> > Many thanks,
> >
> > Will
>
> --------------------------------------------
> Grant Ingersoll | @gsingers
> http://www.lucidworks.com
