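The top terms come from the centroid of the cluster. The value next to each term is that term's weight in the centroid vector, i.e. the average of its weight over the documents in the cluster. It is a term frequency (or a TF-IDF weight, if the vectors were built with that weighting), not a cosine distance.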
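To make this concrete, here is a minimal Python sketch of the idea, not Mahout's actual code. It assumes each document is a sparse term-weight vector, takes the centroid as the component-wise mean, and lists the largest components as the top terms. The names `centroid` and `top_terms` and the toy data are made up for illustration.

```python
def centroid(vectors):
    """Component-wise mean of sparse term-weight vectors (dicts)."""
    totals = {}
    for vec in vectors:
        for term, weight in vec.items():
            totals[term] = totals.get(term, 0.0) + weight
    n = len(vectors)
    # A term missing from a document counts as weight 0 for that document.
    return {term: total / n for term, total in totals.items()}


def top_terms(center, k):
    """The k terms with the largest weight in the centroid."""
    return sorted(center.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Toy cluster of two documents (hypothetical term weights).
docs = [
    {"hello": 3.0, "you": 1.0},
    {"hello": 1.0, "you": 2.0, "world": 1.0},
]

for term, weight in top_terms(centroid(docs), 2):
    print(f"{term} => {weight}")
```

The printed numbers are averaged term weights, which is why they are not bounded between 0 and 1 the way a cosine distance would be.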

On Sun, Oct 7, 2012 at 5:38 PM, jung hoon sohn <[email protected]> wrote:

> Hello,
> I used k-means algorithm to cluster the text terms in the documents
> according to the cosine distance measure.
> It ran successfully and when we ran the clusterdump utility to see the top
> terms per each clusters,
> I get the output such as
>
> Top Terms:
>
> hello => 21.8977799999
> you => 11.9284304939
> ....
>
> I am guessing the value next to the each terms are cosine distance values
> but not very sure about it.
> Does anyone know specifically what does the value represent?
>
> Thanks.
>
> Jung Hoon Sohn
>
