Hi,

I have the following case where numItems = 1,000,000, prefs1Size = 900,000
and prefs2Size = 100.

This is the case where I have two users: one who has seen 90% of the movies
in the database and another who has seen only 100 of the million movies.
Suppose they have 90 movies in common (user 2 has seen only 100 movies in
total). I would expect the similarity to be higher than when they have only
10 movies in common. But the similarities I am getting from
LogLikelihoodSimilarity are
0.9971 for intersection size 10 and
0 for intersection size 90.

This seems counterintuitive.
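
For reference, here is a minimal Python sketch of how I understand the
calculation. It is not the library code: I am assuming the similarity is
1 - 1/(1 + LLR), where LLR is the log-likelihood ratio of a 2x2 contingency
table built from the four counts. The function names and the mapping of
counts to table cells are my own assumptions.

```python
import math


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G statistic) for a 2x2 contingency table."""
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g = 0.0
    for i in range(2):
        for j in range(2):
            k = observed[i][j]
            if k > 0:  # treat 0 * log(0) as 0
                # Count expected in this cell if the two users were independent.
                expected = rows[i] * cols[j] / total
                g += k * math.log(k / expected)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * g)


def similarity(num_items, prefs1_size, prefs2_size, intersection_size):
    # Assumed mapping of the counts onto the table cells:
    #   k11 = items both users have seen
    #   k12 = items only user 2 has seen
    #   k21 = items only user 1 has seen
    #   k22 = items neither user has seen
    score = llr(
        intersection_size,
        prefs2_size - intersection_size,
        prefs1_size - intersection_size,
        num_items - prefs1_size - prefs2_size + intersection_size,
    )
    return 1.0 - 1.0 / (1.0 + score)


for n in (10, 90):
    print(n, similarity(1_000_000, 900_000, 100, n))
```

With the numbers above, this sketch gives a value close to 0.9971 for
intersection size 10 and 0 for intersection size 90, so it at least
reproduces what I am seeing.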

Am I missing something? Is there an explanation for the values above?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/LogLikelihoodSimilarity-calculation-tp4158035.html
Sent from the Mahout User List mailing list archive at Nabble.com.
