Re: user-user recommendations

Sean Owen Fri, 18 Feb 2011 23:20:45 -0800

User-user similarity is based on these counts? That sounds a bit like
the Tanimoto / Jaccard coefficient. See TanimotoCoefficientSimilarity.
Yes, you can use that, though log-likelihood is probably a more
sophisticated choice.
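
For reference, here is a minimal sketch of the Tanimoto / Jaccard
coefficient on two users' item sets. It is plain Python, not Mahout
code; the function name and the sample item IDs are made up for
illustration.

```python
def tanimoto(items_a, items_b):
    """Tanimoto / Jaccard coefficient: |A & B| / |A | B| over sets of item IDs."""
    a, b = set(items_a), set(items_b)
    union = len(a | b)
    if union == 0:
        # Two users with no preferences at all: define similarity as 0.
        return 0.0
    return len(a & b) / union


# Hypothetical users; only the presence of a preference matters, not its value.
alice = {"item1", "item2", "item3"}
bob = {"item2", "item3", "item4"}
print(tanimoto(alice, bob))  # 2 shared items / 4 distinct items = 0.5
```

Note that the preference values are ignored entirely, which is why this
fits the "counts" view of the data.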

Recommending the item that occurs most often in the neighborhood? Sure,
you can make it work that way. It probably works "OK" in practice, though
you can see possible problems with it. What if everyone in the
neighborhood hates an item? Counting alone would still recommend it
highly. It also throws away the degree of similarity between the target
user and each neighbor who likes an item.
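
To make the difference concrete, here is a rough sketch comparing a
count-based ranking with a similarity-weighted average over the same
neighborhood. The neighborhood data, the 1-5 rating scale and both
function names are assumptions for illustration; this is not how Mahout
implements either one.

```python
from collections import defaultdict

# Hypothetical neighborhood: neighbor -> (similarity to target user, {item: rating 1-5}).
neighborhood = {
    "n1": (0.9, {"A": 5.0, "B": 1.0}),
    "n2": (0.2, {"B": 1.0, "C": 4.0}),
    "n3": (0.1, {"B": 2.0}),
}


def rank_by_count(neighborhood):
    """Rank items by how many neighbors expressed any preference for them."""
    counts = defaultdict(int)
    for _similarity, prefs in neighborhood.values():
        for item in prefs:
            counts[item] += 1
    # Highest count first; ties broken by item ID for a stable order.
    return sorted(counts, key=lambda item: (-counts[item], item))


def rank_by_weighted_average(neighborhood):
    """Rank items by the similarity-weighted average of neighbors' ratings."""
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for similarity, prefs in neighborhood.values():
        for item, rating in prefs.items():
            weighted_sum[item] += similarity * rating
            total_weight[item] += similarity
    estimate = {
        item: weighted_sum[item] / total_weight[item]
        for item in weighted_sum
        if total_weight[item] > 0
    }
    return sorted(estimate, key=lambda item: (-estimate[item], item))


print(rank_by_count(neighborhood))             # ['B', 'A', 'C']
print(rank_by_weighted_average(neighborhood))  # ['A', 'C', 'B']
```

Item B is rated by every neighbor, and rated badly, so counting puts it
first while the weighted average puts it last.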

The conventional wisdom in recommenders is that you want to fight the
tendency to always recommend well-known items. People probably already
know about the well-known items even if they've not rated them yet. It
also makes the recommendations less personalized in a sense -- the
recommendation result approaches the one you'd get by just
recommending the globally most-preferred items.

If your goal is to fight sparseness, start looking at SVD-based
methods. This is really the point of SVDs, to "summarize" a very
high-dimensional user-item matrix in a much lower-dimensional "user
group" - "item group" matrix. Maybe you don't have enough information
to recommend Bauhaus to Joan, a teenage goth, but the SVD lets you
sort of draw conclusions like "gothy teens like Peter Murphy's
albums". That is, the summary is much less sparse and so works better
for recommendation for users/items with little connection to the rest
of the matrix otherwise.
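
A toy sketch of that idea, under loud assumptions: the user-item
matrix below is invented, missing preferences are simply treated as 0
(a simplification), and the rank-2 approximation is computed with plain
power iteration plus deflation as a stand-in for a real SVD routine.
It is not what Mahout's SVD-based recommender does internally.

```python
import math


def top_singular_triplet(m, iterations=500):
    """Power iteration on M^T M; returns (sigma, u, v) for the largest singular value."""
    rows, cols = len(m), len(m[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        mv = [sum(m[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(m[i][j] * mv[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in mv))
    u = [x / sigma for x in mv] if sigma > 0 else [0.0] * rows
    return sigma, u, v


def low_rank_approximation(m, rank):
    """Sum of the top `rank` singular components, found one at a time by deflation."""
    rows, cols = len(m), len(m[0])
    residual = [row[:] for row in m]
    approx = [[0.0] * cols for _ in range(rows)]
    for _ in range(rank):
        sigma, u, v = top_singular_triplet(residual)
        for i in range(rows):
            for j in range(cols):
                component = sigma * u[i] * v[j]
                approx[i][j] += component
                residual[i][j] -= component
    return approx


# Hypothetical data. Columns: Bauhaus, Peter Murphy, pop album 1, pop album 2.
ratings = [
    [1.0, 1.0, 0.0, 0.0],  # goth teen
    [1.0, 1.0, 0.0, 0.0],  # goth teen
    [0.0, 1.0, 0.0, 0.0],  # Joan: has only Peter Murphy so far
    [0.0, 0.0, 1.0, 1.0],  # pop fan
    [0.0, 0.0, 1.0, 1.0],  # pop fan
]
JOAN, BAUHAUS, POP1 = 2, 0, 2

approx = low_rank_approximation(ratings, rank=2)
print(round(approx[JOAN][BAUHAUS], 2))  # clearly positive, about 0.49
print(round(approx[JOAN][POP1], 2))     # about 0
```

Joan has no Bauhaus preference in the raw matrix, but the two-factor
summary gives her a positive Bauhaus score and essentially nothing for
the pop albums.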

On Sat, Feb 19, 2011 at 2:43 AM, Chris Schilling <[email protected]> wrote:
> Hello again,
>
> Very simple question here:  I am also testing the user-user cf in mahout.  
> So, once I define my user neighborhood, is it possible to select the 
> recommendations from that based on the number of preferences per item rather 
> than a weighted average?  Basically, I'd like to recommend the items with the 
> most preferences.  It would be simple to implement, so I was curious if this 
> was already possible.  I understand that in this case, the counts become 
> dependent on the size of the neighborhood. This is something I'd want to use 
> for testing.
>
> Thanks
> Chris
