Hi Dmitriy,

I am following the same calculation used in the userSimilarity method in
LogLikelihoodSimilarity.java:

k11 = intersectionSize       (viewed by both users)

k12 = prefs2Size - intersectionSize   (viewed only by user 2)

k21 = prefs1Size - intersectionSize    (viewed only by user 1)

k22 = numItems - prefs1Size - prefs2Size + intersectionSize  (viewed by
neither user 1 nor user 2)
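
For reference, here is a minimal Java sketch of that calculation as I
understand it. It is not copied from Mahout: the class name LlrSketch and the
helper expectedOverlap are made up for illustration, and it assumes the
similarity is the log-likelihood ratio (LLR) mapped to 1 - 1/(1 + LLR).

```java
// Illustrative sketch only; names are hypothetical, not Mahout's API.
final class LlrSketch {

    // x * ln(x), with 0 * ln(0) taken as 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized Shannon entropy of a set of counts: N ln N - sum(k ln k).
    static double entropy(long... counts) {
        long sum = 0;
        double result = 0.0;
        for (long c : counts) {
            result += xLogX(c);
            sum += c;
        }
        return xLogX(sum) - result;
    }

    // Log-likelihood ratio of a 2x2 contingency table.
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double columnEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point round-off.
        return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
    }

    // Builds k11..k22 exactly as defined above and maps the LLR into [0, 1).
    // The 1 - 1/(1 + LLR) mapping is an assumption of this sketch.
    static double similarity(long numItems, long prefs1Size, long prefs2Size,
                             long intersectionSize) {
        long k11 = intersectionSize;
        long k12 = prefs2Size - intersectionSize;
        long k21 = prefs1Size - intersectionSize;
        long k22 = numItems - prefs1Size - prefs2Size + intersectionSize;
        double llr = logLikelihoodRatio(k11, k12, k21, k22);
        return 1.0 - 1.0 / (1.0 + llr);
    }

    // Overlap expected if the two users picked items independently.
    static double expectedOverlap(long numItems, long prefs1Size, long prefs2Size) {
        return (double) prefs1Size * prefs2Size / numItems;
    }

    public static void main(String[] args) {
        long numItems = 1_000_000L, prefs1Size = 900_000L, prefs2Size = 100L;
        System.out.println("expected overlap: "
                + expectedOverlap(numItems, prefs1Size, prefs2Size));
        System.out.println("intersection 10:  "
                + similarity(numItems, prefs1Size, prefs2Size, 10));
        System.out.println("intersection 90:  "
                + similarity(numItems, prefs1Size, prefs2Size, 90));
    }
}
```

With the numbers from the quoted message below, the overlap expected by chance
is 900,000 * 100 / 1,000,000 = 90. An intersection of 90 is therefore exactly
what independence predicts, so the LLR is 0 up to round-off. An intersection
of 10 is far from that expectation, and since the LLR measures departure from
independence in either direction, it gives a score close to 1. That may be why
the two values look reversed.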


Thanks,

Aishwarya

On Wed, Sep 10, 2014 at 2:25 PM, Dmitriy Lyubimov <[email protected]> wrote:

> how do you compute k11, k12... values exactly?
>
> On Wed, Sep 10, 2014 at 1:55 PM, aishsesh <
> [email protected]> wrote:
>
> > Hi,
> >
> > I have the following case, where numItems = 1,000,000, prefs1Size =
> > 900,000, and prefs2Size = 100.
> >
> > There are two users: one has seen 90% of the movies in the database, and
> > the other has seen only 100 of the million movies. If they have 90 movies
> > in common (user 2 has seen only 100 movies in total), I would expect the
> > similarity to be higher than when they have only 10 movies in common. But
> > the similarities I am getting are
> > 0.9971 for intersection size 10 and
> > 0 for intersection size 90.
> >
> > This seems counterintuitive.
> >
> > Am I missing something? Is there an explanation for the above-mentioned
> > values?
> >
> >
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/LogLikelihoodSimilarity-calculation-tp4158035.html
> > Sent from the Mahout User List mailing list archive at Nabble.com.
> >
>
