As I understand it, item-to-item collaborative filtering takes the items
that the user has already rated and uses their similarity to the required
item to compute the user's estimated rating for that item.

User-based collaborative filtering, on the other hand, always uses the same
neighborhood set, irrespective of whether those neighbors have rated the
particular item or not. You check for this in the code, but the neighborhood
is always the same n members, so it may happen that none of them has rated
the item.

To put it mathematically (let r denote the rating and s the similarity):

item-item cf -
r(user, item_i) = SUM_(all j rated by user) s(item_i, item_j) * r(user, item_j)

user-user cf -
r(user_i, item) = SUM_(all j in user_i neighborhood) s(user_i, user_j) * r(user_j, item)
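
To make the two sums concrete, here is a minimal Python sketch. It only
illustrates the formulas as written above and is not Mahout's code: the
nested-dict data layout, the function names, and the toy similarity values
are all assumptions, and the sums are left unnormalized exactly as in the
formulas.

```python
# Hypothetical toy data: ratings[user][item] = rating.
ratings = {
    "u1": {"a": 4.0, "b": 2.0},
    "u2": {"a": 5.0, "c": 3.0},
    "u3": {"b": 1.0, "c": 4.0},
}


def make_sim(table):
    """Build a symmetric similarity lookup; unknown pairs get 0.0 (assumption)."""
    def sim(x, y):
        return table.get(frozenset((x, y)), 0.0)
    return sim


# Made-up similarity values, standing in for a real similarity measure.
item_sim = make_sim({frozenset(("a", "c")): 0.5, frozenset(("b", "c")): 0.2})
user_sim = make_sim({frozenset(("u1", "u2")): 0.9, frozenset(("u1", "u3")): 0.1})


def item_item_estimate(ratings, item_sim, user, item_i):
    """SUM over items j rated by user of s(item_i, item_j) * r(user, item_j).

    item_i is assumed to be an item the user has not rated yet.
    """
    return sum(item_sim(item_i, j) * r for j, r in ratings[user].items())


def user_user_estimate(ratings, user_sim, neighborhood, user_i, item):
    """SUM over a fixed neighborhood of s(user_i, user_j) * r(user_j, item).

    Neighbors who have not rated the item contribute nothing.
    """
    total = 0.0
    for user_j in neighborhood:
        r = ratings[user_j].get(item)
        if r is not None:
            total += user_sim(user_i, user_j) * r
    return total


# Item-item: every term comes from an item u1 has actually rated.
estimate_ii = item_item_estimate(ratings, item_sim, "u1", "c")

# User-user with a fixed neighborhood: if no neighbor rated the item,
# the sum is empty and the estimate degenerates to 0.0.
estimate_uu = user_user_estimate(ratings, user_sim, ["u2", "u3"], "u1", "c")
estimate_empty = user_user_estimate(ratings, user_sim, ["u3"], "u1", "a")
```

The last call shows the situation described above: the neighborhood is
chosen without looking at the item, so every neighbor may be skipped.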

Why don't we use the same approach in user to user cf? Why not retrieve
those users who have rated the particular item in question and then either
take the k closest ones or keep those above some similarity threshold? Or,
why not use the same user to user logic in item-item cf?
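
The item-aware neighborhood I am asking about could look roughly like the
Python sketch below. This is only an illustration of the idea, not anything
Mahout provides: the function name item_aware_neighborhood, the nested-dict
ratings layout, and the similarity values are all hypothetical.

```python
# Hypothetical toy data: ratings[user][item] = rating.
ratings = {
    "u1": {"a": 4.0, "b": 2.0},
    "u2": {"a": 5.0, "c": 3.0},
    "u3": {"b": 1.0, "c": 4.0},
}

# Made-up user similarities, keyed by unordered pair.
_user_sims = {frozenset(("u1", "u2")): 0.9, frozenset(("u1", "u3")): 0.1}


def user_sim(x, y):
    """Symmetric similarity lookup; unknown pairs get 0.0 (assumption)."""
    return _user_sims.get(frozenset((x, y)), 0.0)


def item_aware_neighborhood(ratings, user_sim, user_i, item, k=None, threshold=None):
    """Pick neighbors only among users who have rated `item`.

    Optionally keep those with similarity >= threshold, then the k most
    similar. Returns user ids ordered from most to least similar.
    """
    candidates = [u for u in ratings if u != user_i and item in ratings[u]]
    if threshold is not None:
        candidates = [u for u in candidates if user_sim(user_i, u) >= threshold]
    candidates.sort(key=lambda u: user_sim(user_i, u), reverse=True)
    return candidates if k is None else candidates[:k]


# Every returned neighbor is guaranteed to have a rating for item "c".
top_one = item_aware_neighborhood(ratings, user_sim, "u1", "c", k=1)
above_half = item_aware_neighborhood(ratings, user_sim, "u1", "c", threshold=0.5)
```

The obvious cost is that the neighborhood has to be recomputed per item
instead of once per user.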

I know you can argue that users tend to be similar because of their general
taste in items, and that this similarity is not determined much by
individual items. So it makes more sense to use a constant neighborhood for
users, irrespective of which item we are trying to rate. But the same logic
can be applied to item-item cf: an item's inherent features are determined
by how many users rate it together with other items, and so on.

I may be missing a simple point here, but I cannot figure out why the two
are implemented differently. Please correct me if my observation is wrong
and the two implementations are in fact the same!

Thanks!
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/GenericUserBasedRecommender-vs-GenericItemBasedRecommender-tp1565019p1565019.html
Sent from the Mahout User List mailing list archive at Nabble.com.
