Thank you for the information.
Based on your answer, the top terms in each of the clusters have similar
frequencies.
Since I used cosine distance as the distance measure, is this the correct result?
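
A minimal sketch of what the quoted reply describes, assuming the centroid is
the plain mean of the term vectors in a cluster and the top terms are its
largest entries. This is not Mahout code: the vocabulary, the vectors, and the
function names are all made up for illustration. It also shows that a cosine
distance between non-negative vectors stays between 0 and 1, so a value such
as 21.9 cannot be one.

```python
import math

# Hypothetical toy data: term-frequency vectors of three documents in one cluster.
vocab = ["hello", "you", "world"]
docs = [
    [20.0, 10.0, 0.0],
    [24.0, 14.0, 2.0],
    [22.0, 12.0, 1.0],
]

def centroid(vectors):
    """Component-wise mean of the vectors (assumed centroid definition)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def top_terms(center, terms, k=2):
    """Return the k (term, weight) pairs with the largest centroid weight."""
    ranked = sorted(zip(terms, center), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

def cosine_distance(a, b):
    """1 - cosine similarity; depends only on the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

center = centroid(docs)
for term, weight in top_terms(center, vocab):
    # Printed values are centroid weights (averaged frequencies), not distances.
    print(f"{term} => {weight}")

# The distance from a document to the centroid is a separate, much smaller number.
print(cosine_distance(docs[0], center))
```

In this sketch the numbers printed next to the terms are averaged term
frequencies; the distance measure only decides which documents end up in
which cluster.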

Thank you.

Jung Hoon Sohn

On Sun, Oct 7, 2012 at 9:35 PM, paritosh ranjan <[email protected]> wrote:

> The top terms come from the centroid of the cluster. These values are the
> term frequencies.
>
> On Sun, Oct 7, 2012 at 5:38 PM, jung hoon sohn <[email protected]> wrote:
>
> > Hello,
> > I used the k-means algorithm to cluster the text terms in the documents
> > according to the cosine distance measure.
> > It ran successfully, and when I ran the clusterdump utility to see the
> > top terms for each cluster,
> > I got output such as
> >
> >       Top Terms:
> >
> >             hello    =>     21.8977799999
> >             you     =>     11.9284304939
> >             ....
> >
> > I am guessing the value next to each term is a cosine distance value,
> > but I am not sure about it.
> > Does anyone know specifically what the value represents?
> >
> > Thanks.
> >
> > Jung Hoon Sohn
> >
>
