Hi Dmitriy,

I am following the same calculation used in the userSimilarity method in LogLikelihoodSimilarity.java:
k11 = intersectionSize (both users viewed the movie)
k12 = prefs2Size - intersectionSize (viewed only by user 2)
k21 = prefs1Size - intersectionSize (viewed only by user 1)
k22 = numItems - prefs1Size - prefs2Size + intersectionSize (viewed by neither user 1 nor user 2)

Thanks,
Aishwarya

On Wed, Sep 10, 2014 at 2:25 PM, Dmitriy Lyubimov <[email protected]> wrote:

> how do you compute k11, k12... values exactly?
>
> On Wed, Sep 10, 2014 at 1:55 PM, aishsesh <[email protected]> wrote:
>
> > Hi,
> >
> > I have the following case where numItems = 1,000,000, prefs1Size = 900,000
> > and prefs2Size = 100.
> >
> > It is the case when i have two users, one who has seen 90% of the movies in
> > the database and another only 100 of the million movies. Suppose they have
> > 90 movies in common (user 2 has seen only 100 movies totally), i would
> > assume the similarity to be high compared to when they have only 10 movies
> > in common. But the similarities i am getting are
> > 0.9971 for intersection size 10 and
> > 0 for intersection size 90.
> >
> > This seems counter intuitive.
> >
> > Am i missing something? Is there an explanation for the above mentioned
> > values?
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/LogLikelihoodSimilarity-calculation-tp4158035.html
> > Sent from the Mahout User List mailing list archive at Nabble.com.
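
For reference, the k11..k22 counts defined at the top of this message can be plugged into a standard log-likelihood ratio (G-test) over the 2x2 table. The following is only a minimal sketch of that calculation, not Mahout's actual source: the class name LlrSketch and its helper methods are hypothetical, and the mapping similarity = 1 - 1/(1 + LLR) is an assumption about how the ratio is turned into a score.

```java
// Sketch only: a stand-alone G-test over the 2x2 table described in this thread.
// LlrSketch and its method names are hypothetical, not part of Mahout.
final class LlrSketch {

    // x * ln(x), using the convention 0 * ln(0) = 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized entropy of a set of counts: N ln N - sum(k ln k).
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            parts += xLogX(c);
            sum += c;
        }
        return xLogX(sum) - parts;
    }

    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double columnEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point round-off.
        return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
    }

    // Assumed mapping from LLR to a score in [0, 1): 1 - 1 / (1 + LLR).
    static double similarity(long numItems, long prefs1Size, long prefs2Size,
                             long intersectionSize) {
        long k11 = intersectionSize;                                       // viewed by both
        long k12 = prefs2Size - intersectionSize;                          // only user 2
        long k21 = prefs1Size - intersectionSize;                          // only user 1
        long k22 = numItems - prefs1Size - prefs2Size + intersectionSize;  // neither
        double llr = logLikelihoodRatio(k11, k12, k21, k22);
        return 1.0 - 1.0 / (1.0 + llr);
    }

    public static void main(String[] args) {
        // Numbers from the thread: 1,000,000 items, 900,000 vs. 100 preferences.
        System.out.println(similarity(1_000_000, 900_000, 100, 10));
        System.out.println(similarity(1_000_000, 900_000, 100, 90));
    }
}
```

With the numbers from the thread, this sketch should give a score close to 1 for an intersection of 10 and a score close to 0 for an intersection of 90. One possible reading: if the two users were independent, the expected overlap would be 100 x 0.9 = 90, and the ratio measures departure from independence in either direction, so an overlap of exactly 90 carries no signal while an overlap of 10 is a strong (negative) deviation.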
