Hi all,

I'm using Canopy clustering with the cosine similarity measure as input to
k-means clustering. I'm wondering how the similarity between documents is
calculated with respect to the t1 and t2 parameters.
Let's say t1=0.8 and t2=0.5. With cosine similarity, s(d1,d2)>0.8 means
the documents are very similar, and s(d1,d2)<0.5 means they are less
similar (or not similar at all).

In the Canopy algorithm, if s(d1,d2)<t2 then d1 and d2 are assigned to the
same canopy. But with cosine similarity, s(d1,d2)<t2 (0.5 here) means there
is little or no similarity.
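
To make the two readings concrete, here is a minimal Python sketch. It is
not Mahout code, and the function names are made up for illustration. It
assumes the thresholds are compared against a cosine distance defined as
1 - similarity, with t1 > t2, which is only one possible reading of the
setup described above.

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity: dot product over the product of norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    # Assumption: distance is 1 - similarity, so 0 means identical direction.
    return 1.0 - cosine_similarity(a, b)

def canopy_relation(a, b, t1, t2):
    # Hypothetical helper: classify a pair by DISTANCE, assuming t1 > t2.
    d = cosine_distance(a, b)
    if d < t2:
        return "tight"    # within t2: strongly bound to the canopy
    if d < t1:
        return "loose"    # within t1 only: in the canopy, but loosely
    return "outside"      # farther than t1: not in this canopy

d1 = [1.0, 1.0, 0.0]
d2 = [1.0, 1.0, 0.1]   # nearly the same direction as d1
d3 = [0.0, 0.0, 1.0]   # orthogonal to d1

print(canopy_relation(d1, d2, t1=0.8, t2=0.5))
print(canopy_relation(d1, d3, t1=0.8, t2=0.5))
```

Under that assumption, a small value below t2 would mean "very similar",
which is the opposite of what s(d1,d2)<0.5 means for a raw similarity score.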

I'm asking for clarification on that point. Maybe I'm wrong, but I would
like to understand it.

Could anyone please tell me how the cosine similarity is computed with
respect to the t1 and t2 parameters?

Thanks in advance
Doni
