This is just one of an infinite number of variations on item-based recommendation. The general idea is that you do some kind of magic to find item-item connections, you trim those to make it all tractable, and then you recommend the items linked from the user's history of items they liked. If the budget runs out (time, space, or $), then you trim more. All the GroupLens guys are saying is that trimming didn't hurt accuracy, so it is probably a good thing to do.
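The trim-then-recommend step above can be sketched in a few lines. This is a minimal illustration, not Mahout's implementation: the `item_links` table, its link weights, and the top-k trimming are all invented for the example.

```python
from collections import defaultdict

def recommend(user_items, item_links, top_n=3):
    """Score candidate items by summing the weights of the trimmed
    item-item links that connect them to the user's liked items,
    then return the best candidates the user hasn't seen yet."""
    scores = defaultdict(float)
    for liked in user_items:
        for candidate, weight in item_links.get(liked, []):
            if candidate not in user_items:
                scores[candidate] += weight
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# A trimmed item-item table: each item keeps only its k strongest
# neighbours (here k=2).  If the budget shrinks, shrink these lists.
item_links = {
    "a": [("b", 0.9), ("c", 0.4)],
    "b": [("a", 0.9), ("d", 0.7)],
    "c": [("a", 0.4), ("d", 0.2)],
}

print(recommend({"a", "b"}, item_links))  # → ['d', 'c']
```

The trimming shows up only in how short each neighbour list is; the recommendation loop itself never changes.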
The off-line connection finding can be done using LLR (for moderately high traffic situations), SVD (for cases where transitive dependencies are important), random indexing (poor man's SVD) or LDA (where small counts make SVD give crazy results). There are many other possibilities as well. It would be great if you felt an itch to implement some of these and decided to scratch it and contribute the results back to Mahout.

On Sat, Feb 20, 2010 at 6:46 AM, jamborta <[email protected]> wrote:

> the basic concept of neighbourhood for item-based recommendation comes from
> this paper:
>
> http://portal.acm.org/citation.cfm?id=371920.372071
>
> this is the idea:
>
> "The fact that we only need a small fraction of similar items to compute
> predictions leads us to an alternate model-based scheme. In this scheme, we
> retain only a small number of similar items. For each item j we compute the
> k most similar items. We term k as the model size. Based on this model
> building step, our prediction generation algorithm works as follows. For
> generating predictions for a user u on item i, our algorithm first
> retrieves the precomputed k most similar items corresponding to the target
> item i. Then it looks how many of those k items were purchased by the user
> u, based on this intersection then the prediction is computed using basic
> item-based collaborative filtering algorithm."
>
> --
> View this message in context:
> http://old.nabble.com/item-based-recommendation-neighbourhood-size-tp27661482p27666954.html
> Sent from the Mahout User List mailing list archive at Nabble.com.

--
Ted Dunning, CTO DeepDyve
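For readers following along: the LLR measure mentioned above is the log-likelihood ratio (G²) on a 2x2 co-occurrence table. A minimal sketch of that statistic, using the standard entropy formulation (this is an illustration of the test itself, not Mahout code):

```python
from math import log

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 co-occurrence table:
    k11 = users who liked both items, k12/k21 = users who liked
    one but not the other, k22 = users who liked neither."""
    def entropy(*counts):
        total = sum(counts)
        return sum(-c / total * log(c / total) for c in counts if c > 0)

    n = k11 + k12 + k21 + k22
    row_entropy = entropy(k11 + k12, k21 + k22)
    col_entropy = entropy(k11 + k21, k12 + k22)
    matrix_entropy = entropy(k11, k12, k21, k22)
    # G^2 = 2 * N * (H(rows) + H(cols) - H(matrix)); zero when the
    # two items co-occur no more often than chance would predict.
    return 2.0 * n * (row_entropy + col_entropy - matrix_entropy)
```

Large scores flag item pairs whose co-occurrence is hard to explain by chance, which is exactly what makes the measure usable as the "magic" connection-finding step even at moderate traffic.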
