Thank you for the information. Following your answer, the top terms from the clusters have similar frequencies. Since I used cosine distance as the measure, is this the correct result?
Thank you.

Jung Hoon Sohn

On Sun, Oct 7, 2012 at 9:35 PM, paritosh ranjan <[email protected]> wrote:

> The top terms come from the centroid of the cluster. These values are the
> term frequencies.
>
> On Sun, Oct 7, 2012 at 5:38 PM, jung hoon sohn <[email protected]> wrote:
>
> > Hello,
> > I used k-means algorithm to cluster the text terms in the documents
> > according to the cosine distance measure.
> > It ran successfully and when we ran the clusterdump utility to see the top
> > terms per each clusters,
> > I get the output such as
> >
> > Top Terms:
> >
> > hello => 21.8977799999
> > you => 11.9284304939
> > ....
> >
> > I am guessing the value next to the each terms are cosine distance values
> > but not very sure about it.
> > Does anyone know specifically what does the value represent?
> >
> > Thanks.
> >
> > Jung Hoon Sohn
> >
>
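To illustrate the explanation quoted above, here is a minimal Python sketch (not Mahout code) of the distinction between the two quantities. It assumes the document vectors hold raw term counts and that a centroid is the per-term mean of its member vectors; the vocabulary, the sample counts, and the helper names `cosine_distance`, `centroid`, and `top_terms` are all hypothetical. Cosine distance only decides which cluster a document joins, while the numbers listed under "Top Terms" would be centroid weights. Values such as 21.89 could not be cosine distances in any case, since for non-negative vectors that distance lies between 0 and 1.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def centroid(vectors):
    """Per-term mean of the member vectors (assumed centroid definition)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def top_terms(center, vocabulary, k):
    """Pair each term with its centroid weight and keep the k largest."""
    ranked = sorted(zip(vocabulary, center), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical term-count vectors for two documents in the same cluster.
vocabulary = ["hello", "you", "world"]
docs = [[20.0, 10.0, 0.0], [24.0, 14.0, 2.0]]

center = centroid(docs)
top = top_terms(center, vocabulary, 2)

for term, weight in top:
    print(f"{term} => {weight}")

# The distance measure is used for assignment, not for the printed weights.
distance_to_center = cosine_distance(docs[0], center)
```

Under these assumptions the printed weights are averages of term counts, so they are unbounded, whereas `distance_to_center` stays small for documents that point in nearly the same direction as the centroid.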
