On Thu, Sep 23, 2010 at 2:35 AM, gabeweb <[email protected]> wrote:
> I think the simple point is that the primary use case of a recommender is to
> return the n-best recommended items, rather than return the predicted rating
> for a single item. In that case, if an item can't get a predicted rating
> because no users in the neighborhood have rated it, then that lack of
> ratings clearly suggests that similar users are not interested in that item!

That's right. I'm guessing the idea was to make a hybrid approach: still build
a neighborhood and pick candidates from there, but compute an estimate over
everyone (neighborhood or not) who rated the item. That is coherent, but you're
still picking a neighborhood while now basing the similarity computation on the
tastes of potentially quite dissimilar users. It might not be a good thing. In
any event, I don't think this hybrid is then "symmetric" with item-based
recommenders, yes.

> For item-based recommenders, I think the problem of using a fixed "nearest
> item" neighborhood is the fact that any particular user will not have
> ratings for many of the items in that fixed neighborhood -- which renders
> those items being in the neighborhood useless for predicting ratings for

You mean: start with the user-rated items as candidate items, then use a
neighborhood of items around those as the basis for a similarity computation?
Yes, exactly, that doesn't work. The user probably hasn't rated much in that
neighborhood.

> that user. So in this case, it makes more sense to use the user-rated items
> as the neighborhood. However, in this case, I could see the argument for
> putting an upper limit on this neighborhood size, in case the user has rated

(Oops, maybe that's not what you were getting at.)

> a huge number of items. One could calculate the (e.g.) 500 most similar
> items that a particular user has rated, and use that as the neighborhood,
> instead of all of the (e.g.) 2,000 items that the user rated. That would
> obviously be a speed optimization analogous to setting the user-user
> neighborhood size, rather than something that would be necessarily expected
> to improve accuracy.
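
To make the capped-neighborhood idea concrete, here is a rough sketch in Python. It is not taken from any library's code: the function name, the `similarity` callback, and the choice to skip non-positive similarities are all assumptions made for illustration. It treats the items the user has rated as the neighborhood for a candidate item, keeps at most `max_neighbors` of them (the "500 of 2,000" cap suggested above), and returns a similarity-weighted average of the user's ratings.

```python
import math
from typing import Callable, Dict


def estimate_preference(
    user_ratings: Dict[str, float],
    candidate: str,
    similarity: Callable[[str, str], float],
    max_neighbors: int = 500,
) -> float:
    """Estimate a user's rating for `candidate` from the user's own rated items.

    Hypothetical helper: the neighborhood is the set of items the user has
    rated, capped at the `max_neighbors` items most similar to the candidate.
    """
    # Pair each rated item's similarity to the candidate with the user's rating.
    scored = [
        (similarity(candidate, item), rating)
        for item, rating in user_ratings.items()
        if item != candidate
    ]
    # Assumption: items with non-positive similarity carry no useful signal.
    scored = [(sim, rating) for sim, rating in scored if sim > 0.0]
    # Most similar first, then apply the cap on neighborhood size.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:max_neighbors]

    total_weight = sum(sim for sim, _ in top)
    if total_weight == 0.0:
        # No usable neighbors, so no estimate can be made.
        return math.nan
    return sum(sim * rating for sim, rating in top) / total_weight
```

With the cap in place, the weighted sum costs at most `max_neighbors` terms per candidate, although this sketch still computes a similarity for every rated item before sorting. As noted in the thread, the cap is a speed trade-off and is not expected to improve accuracy by itself.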
