I think the simple point is that the primary use case of a recommender is to
return the n-best recommended items, rather than return the predicted rating
for a single item.  In that case, if an item can't get a predicted rating
because no users in the neighborhood have rated it, then that lack of
ratings clearly suggests that similar users are not interested in that item! 
If one were to use item-specific user neighborhoods, then some item might
be rated by only two users who are overall very different from the target
user, but if both of those users rated the item a 5.0, then the recommender
would return a 5.0 (because a weighted average with very small weights is
still just an average -- there is no discounting of the predicted rating
based on the similarity scores), which is intuitively not what you want.
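
To make the weighted-average point concrete, here is a minimal Python sketch
(not Mahout's actual code; the function name and the similarity values are
made up for illustration). It assumes the prediction is a plain
similarity-weighted mean of the neighbors' ratings, with positive weights.

```python
def predict_rating(neighbors):
    """Similarity-weighted mean of neighbor ratings.

    neighbors: list of (similarity, rating) pairs, similarities assumed > 0.
    Returns None when there are no neighbors to average over.
    """
    total_weight = sum(sim for sim, _ in neighbors)
    if total_weight == 0:
        return None
    return sum(sim * rating for sim, rating in neighbors) / total_weight


# Two barely-similar users who both rated the item 5.0 (hypothetical values).
weak_neighbors = [(0.01, 5.0), (0.02, 5.0)]
# Two very similar users who both rated the item 5.0.
strong_neighbors = [(0.9, 5.0), (0.8, 5.0)]

weak_prediction = predict_rating(weak_neighbors)
strong_prediction = predict_rating(strong_neighbors)
```

Both predictions come out at 5.0 (up to floating-point rounding): the weights
cancel in the division, so how small they are never lowers the estimate.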

For item-based recommenders, I think the problem with using a fixed "nearest
item" neighborhood is that any particular user will not have rated many of
the items in that neighborhood -- which makes those items useless for
predicting ratings for that user.  So it makes more sense to use the items
the user has rated as the neighborhood.  Even then, I could see the argument
for putting an upper limit on the neighborhood size, in case the user has rated
a huge number of items.  One could calculate the (e.g.) 500 items most
similar to the target item among those that a particular user has rated, and
use that as the neighborhood, instead of all of the (e.g.) 2,000 items that
the user rated.  That would
obviously be a speed optimization analogous to setting the user-user
neighborhood size, rather than something that would be necessarily expected
to improve accuracy.
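
A rough Python sketch of that capped neighborhood follows. It is only an
illustration, not how GenericItemBasedRecommender is implemented: the
function names, the `similarity` callback, and the `max_size` parameter are
all hypothetical, and similarities are assumed to be non-negative.

```python
import heapq


def capped_neighborhood(target_item, user_ratings, similarity, max_size=500):
    """Return up to max_size (similarity, item, rating) triples, most similar first.

    user_ratings: dict mapping item -> rating for a single user.
    similarity:   hypothetical callback, similarity(item_a, item_b) -> float >= 0.
    """
    candidates = (
        (similarity(target_item, item), item, rating)
        for item, rating in user_ratings.items()
        if item != target_item
    )
    # nlargest keeps only the max_size best candidates, so the averaging step
    # below never touches the long tail of a heavy rater's items.
    return heapq.nlargest(max_size, candidates, key=lambda c: c[0])


def predict(target_item, user_ratings, similarity, max_size=500):
    """Similarity-weighted mean over the capped neighborhood, or None."""
    neighborhood = capped_neighborhood(target_item, user_ratings, similarity, max_size)
    total_weight = sum(sim for sim, _, _ in neighborhood)
    if total_weight <= 0:
        return None
    return sum(sim * rating for sim, _, rating in neighborhood) / total_weight


def toy_similarity(a, b):
    """Made-up similarity for the example: closer item ids are more similar."""
    return 1.0 / (1.0 + abs(a - b))


ratings = {1: 4.0, 2: 2.0, 10: 5.0}
neighbors = capped_neighborhood(3, ratings, toy_similarity, max_size=2)
estimate = predict(3, ratings, toy_similarity, max_size=2)
```

With `max_size=2`, item 10 is dropped and only items 2 and 1 contribute to
the estimate. Each similarity still has to be computed once to rank the
candidates, so any saving comes from the smaller averaging step (or from
similarities that are precomputed or cached).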

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/GenericUserBasedRecommender-vs-GenericItemBasedRecommender-tp1565019p1565099.html
Sent from the Mahout User List mailing list archive at Nabble.com.
