On Mon, Aug 9, 2010 at 6:46 PM, Yanir Seroussi <[email protected]> wrote:
> As I see it, there are two separate tasks here. The first one is the
> recommendation task, where it makes sense to take the N most similar users
> and generate recommendations based on their preferences. The second one is
> the rating prediction task (or preference estimation in Mahout terms), where
> given a target user and a target item the recommender generates a prediction
> of the rating for the item.

They are related tasks. Estimating preferences is *a* way to achieve
the further goal of recommending items. It is how just about all
algorithms work, so I'd say that in practice estimating preference is
a sub-task of recommendation, and really the core of it.
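
To make that concrete, here is a rough sketch of the relationship, not the actual GenericUserBasedRecommender code: recommending amounts to estimating a preference for each candidate item and keeping the top ones. How candidate items get chosen is deliberately left out, and the IDs are whatever your data uses.

import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.recommender.Recommender;

class RecommendViaEstimates {
  // Sketch only: score every candidate item with estimatePreference(), drop the NaNs
  // (no basis for an estimate), and return the highest-scoring item IDs.
  static List<Long> recommend(Recommender recommender, long userID,
                              Collection<Long> candidateItemIDs, int howMany)
      throws TasteException {
    final Map<Long, Float> estimates = new HashMap<Long, Float>();
    for (Long itemID : candidateItemIDs) {
      float estimate = recommender.estimatePreference(userID, itemID);  // the core sub-task
      if (!Float.isNaN(estimate)) {
        estimates.put(itemID, estimate);
      }
    }
    List<Long> ranked = new ArrayList<Long>(estimates.keySet());
    Collections.sort(ranked, new Comparator<Long>() {
      public int compare(Long a, Long b) {
        return Float.compare(estimates.get(b), estimates.get(a));
      }
    });
    return ranked.subList(0, Math.min(howMany, ranked.size()));
  }
}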

> In the way GenericUserBasedRecommender.estimatePreference() is implemented now, it
> can only generate predictions for very few items unless the neighbourhood
> size is very large.

I don't think this is necessarily true. Sure, sparse data can mean users
are hard to connect to other users, or that those connections yield few
candidates for recommendations. But a large neighborhood doesn't help
with that: if there isn't 1 "nearby" user, there aren't 100 either.

But, crudely, I think your data has to be "really really sparse"
before you frequently can't recommend anything at all.

I don't believe it's a question of implementation. This is simply what
user-based recommendation is.



> To make this concrete, I'll give an example. Suppose we choose N=20, and we
> want to predict the rating of the target user for the movie The Godfather.
> This movie was rated by many users in the dataset, but unfortunately it
> wasn't rated by the 20 users that are most similar to the target user. Thus,
> estimatePreference() will return NaN as the prediction.

That's right. There would be no basis for estimating a preference.
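
For what it's worth, here is roughly what that looks like against the Taste API. This is only a sketch: the data file name and the user/item IDs are placeholders, with item 100 standing in for The Godfather.

import java.io.File;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class NaNEstimateSketch {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("ratings.csv"));   // placeholder file
    UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
    UserNeighborhood neighborhood = new NearestNUserNeighborhood(20, similarity, model);
    Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);

    float estimate = recommender.estimatePreference(1L, 100L);      // placeholder user/item IDs
    if (Float.isNaN(estimate)) {
      // None of the 20 nearest neighbors rated item 100, so there is no basis for an estimate.
      System.out.println("no estimate");
    } else {
      System.out.println(estimate);
    }
  }
}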


> Now suppose that these top 20 users have a similarity of 1 to the target
> user, and that the next 20 users have a similarity of 0.99 to the target
> user, and they have all rated The Godfather. It would make sense to generate
> a rating prediction based on the next 20 users, and this prediction is
> actually likely to be pretty good.

True, and if this were the case, I'd suggest defining the neighborhood
by a similarity threshold rather than by size: let everyone with
similarity >= 0.9, or whatever cutoff fits your data, be in the
neighborhood.
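
In Taste that's ThresholdUserNeighborhood, and in the sketch above it's a one-line swap (0.9 is just an example value):

// import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
// Everyone with similarity >= 0.9 is a neighbor, however many users that turns out to be:
UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.9, similarity, model);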


> As I mentioned in my original message, it is implied from Herlocker et al
> (1999) that the number of neighbours shouldn't affect coverage. Another
> widely cited paper, by Adomavicius and Tuzhilin (2005, see
> http://ids.csom.umn.edu/faculty/gedas/papers/recommender-systems-survey-2005.pdf),
> actually states that explicitly: "That is, the
> value of the unknown rating r(c, s) for user c and item s is usually
> computed as an aggregate of the ratings of some other (usually the N most
> similar) users for the same item s: r(c, s) = aggr(c' in C^, r(c', s)) where
> C^ denotes the set of N users that are the most similar to user c and who
> have rated item s (N can range anywhere from 1 to the number of all users)."
>
> While Sean's approach is probably also okay, especially for the
> recommendation scenario, I think it's worthwhile documenting the fact that
> this behaviour is somewhat inconsistent with the literature.

As (I think) you say, it's consistent with Herlocker -- this is just
plain-vanilla user-based recommendation.

The other approach you cite makes sense but gets a lot more expensive
to compute: you need a new neighborhood for every candidate item. It
is also, in my world view, not the plain-vanilla approach that has
come to be called "user-based recommendation". So, no, I don't find
this inconsistent.
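
For completeness, here is roughly what that per-item variant amounts to. None of this is Mahout code; the plain maps are only there to keep the sketch self-contained (userID -> (itemID -> rating), plus a precomputed similarity of each user to the target user).

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class PerItemNeighborhoodSketch {
  // Sketch of the quoted aggregation: take the N most similar users *who rated item s*,
  // then compute a similarity-weighted average of their ratings for s.
  static double estimate(Map<Long, Map<Long, Double>> ratings,
                         final Map<Long, Double> similarityToTarget,
                         long itemID,
                         int n) {
    // The neighborhood is rebuilt per item: only users who rated this item qualify.
    // This is the extra cost compared to one fixed neighborhood per user.
    List<Long> raters = new ArrayList<Long>();
    for (Map.Entry<Long, Map<Long, Double>> e : ratings.entrySet()) {
      if (e.getValue().containsKey(itemID) && similarityToTarget.containsKey(e.getKey())) {
        raters.add(e.getKey());
      }
    }
    Collections.sort(raters, new Comparator<Long>() {
      public int compare(Long a, Long b) {
        return Double.compare(similarityToTarget.get(b), similarityToTarget.get(a));
      }
    });
    double sum = 0.0;
    double totalWeight = 0.0;
    for (Long v : raters.subList(0, Math.min(n, raters.size()))) {
      double s = similarityToTarget.get(v);
      sum += s * ratings.get(v).get(itemID);
      totalWeight += Math.abs(s);
    }
    // Same convention as estimatePreference(): no raters (zero total weight) means no estimate.
    return totalWeight == 0.0 ? Double.NaN : sum / totalWeight;
  }
}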
