Re: [Taste] Sanity Check and Questions

Sean Owen Fri, 21 Aug 2009 02:02:48 -0700

The only piece of what you're saying that sort of doesn't click with
me is assuming a prior of 0, or some small value. Similarities range
from -1 to 1, so a value of 0 is a positive statement that there is
exactly no relationship between the ratings for both items. This is
different than having no opinion on it and it matters to the
subsequent calculations.

But does your second suggestion really meaningfully change the result?
Yes, it pushes the rating down from 5 to 4.5, but doesn't throwing these
other small terms into the average do roughly the same to all the
values? Perhaps I haven't thought this through. I understand the
intuitions here and think they are right.
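
To make the question concrete, here is a minimal Python sketch of the
weighted average from the example quoted below, computed with and
without the prior terms. The function name `weighted_score` and the
list layout are illustrative assumptions, not anything in Taste.

```python
def weighted_score(ratings, similarities):
    """Similarity-weighted average of a user's ratings."""
    numerator = sum(r * s for r, s in zip(ratings, similarities))
    denominator = sum(similarities)
    return numerator / denominator

# Only the Lincoln book (rating 5, similarity 0.1) is linked to the cookbook.
without_prior = weighted_score([5], [0.1])

# Same user, but the France (3) and space (1) books now get the prior 0.01.
with_prior = weighted_score([5, 3, 1], [0.1, 0.01, 0.01])

print(without_prior, with_prior)  # roughly 5.0 and 4.5
```

In this sketch the prior terms pull the score toward the plain mean of
the otherwise unlinked ratings (2 here), and how far it moves depends on
how large the real similarities are relative to the prior.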

One meta-issue here for the library is this: there's a 'standard'
item-based recommender algorithm out there, and we want to have that.
And we do. So I don't want to touch it -- perhaps add some options to
modify its behavior. So we're talking about maybe inventing a variant
algorithm... or three or four. That's good I guess, though not exactly
the remit I had in mind for the CF part. I was imagining it would
provide access to canonical approaches with perhaps some small
variants tacked on, or at least hooks to modify parts of the logic.

Basically I also need to have a think about how to include variants
like this in a logical way.

On Thu, Aug 20, 2009 at 6:07 PM, Mark Desnoyer<[email protected]> wrote:
> You could do it that way, but I don't think you're restricted to ignoring
> the rating values. For example, you could define similarity between item i
> and item j like (the normalization is probably incomplete, but this is the
> idea):
>
> similarity(i,j) = (prior + sum({x_ij})) / (count({x_ij}) + 1)
>
> where each x_ij is the similarity defined by a single user and could be
> based on their ratings. So I think the way you're thinking of it, x_ij = 1,
> but it could be a function of the ratings, say higher if the ratings are
> closer and lower if they are far apart.
>
> You can still do the weighted average, you just have more items to
> calculate. Say a user has rated the Lincoln book 5, a book on France 3, and
> a book on space travel 1. Assuming there is no data linking the France or
> space books to the cookbook, then their similarities would be the prior, or
> 0.01. Then, you'd calculate the score for the cookbook recommendation as:
>
> score(cookbook) = (5*0.1 + 3*0.01 + 1*0.01) / (0.1 + 0.01 + 0.01) = 4.5
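
The similarity formula in the quoted message can be sketched the same
way. This is only an illustration under stated assumptions: the name
`smoothed_similarity`, the default prior of 0.01 (taken from the example
above), and the sample per-user values are all hypothetical.

```python
def smoothed_similarity(per_user_sims, prior=0.01):
    """(prior + sum(x_ij)) / (count(x_ij) + 1), as written in the quoted formula."""
    return (prior + sum(per_user_sims)) / (len(per_user_sims) + 1)

# No user has rated both items: the result is just the prior.
no_data = smoothed_similarity([])

# Three users contribute x_ij values (hypothetical numbers).
some_data = smoothed_similarity([1.0, 1.0, 0.5])

print(no_data, some_data)
```

With no co-ratings the function returns the prior itself, which is the
case the cookbook example relies on.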
