Yes OK, I like the idea. If you can precompute these neighborhoods, there is no problem of computing them at runtime, so performance isn't such a big deal. It does add the necessary step of re-computing neighborhoods sometimes.

I still imagine there is a sparseness problem: what if I haven't rated anything in the item's neighborhood, but I have rated some items outside its neighborhood? It's too bad I can't get any estimated preference for that item, but maybe it wouldn't be a very reliable estimate or a good recommendation anyway. The problem lessens as the neighborhood grows, but then you just approach the current algorithm.

I could imagine there is some sweet spot where the performance overhead of re-computing neighborhoods is balanced out by better recommendations (?) and faster runtime processing. I don't know whether it would make better recommendations, but I have no reason to think the right setting wouldn't.

I'm also not sure there is a performance issue for item-based recommendation -- it is generally quite fast, since it is typically used when you have some precomputed item similarities to begin with.

I wouldn't discourage you from hacking it into the code and seeing how it goes. If you find it has value, by all means let's work this idea into the implementation. It's just a generalization of what's there now.

Sean

On Sat, Feb 20, 2010 at 2:46 PM, jamborta <[email protected]> wrote:
>
> the basic concept of neighbourhood for item-based recommendation comes from
> this paper:
>
> http://portal.acm.org/citation.cfm?id=371920.372071
>
> this is the idea:
>
> "The fact that we only need a small fraction of similar items to compute
> predictions leads us to an alternate model-based scheme. In this scheme, we
> retain only a small number of similar items. For each item j we compute the
> k most similar items. We term k as the model size. Based on this model
> building step, our prediction generation algorithm works as follows. For
> generating predictions for a user u on item i, our algorithm first
> retrieves the precomputed k most similar items corresponding to the target
> item i. Then it looks how many of those k items were purchased by the user
> u, based on this intersection then the prediction is computed using basic
> item-based collaborative filtering algorithm."
>
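
For what it's worth, here is a rough sketch of the scheme described in the quoted passage, mostly to make the sparseness case concrete. It is not Mahout code: the names build_model and predict, the nested-dict similarity table, and the sample data are all hypothetical, and a similarity-weighted average is assumed as the "basic item-based" prediction step.

```python
from typing import Dict, List, Optional, Tuple

Similarities = Dict[str, Dict[str, float]]
Model = Dict[str, List[Tuple[str, float]]]


def build_model(similarity: Similarities, k: int) -> Model:
    """Offline step: keep only the k most similar items for each item."""
    model: Model = {}
    for item, sims in similarity.items():
        ranked = sorted(sims.items(), key=lambda pair: pair[1], reverse=True)
        model[item] = ranked[:k]
    return model


def predict(model: Model, user_ratings: Dict[str, float], item: str) -> Optional[float]:
    """Estimate a preference from the rated items inside the item's neighborhood.

    Assumption: a similarity-weighted average of the user's ratings.
    Returns None when the user has rated nothing in the neighborhood,
    which is the sparseness case.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for neighbor, sim in model.get(item, []):
        if neighbor in user_ratings:
            weighted_sum += sim * user_ratings[neighbor]
            weight_total += abs(sim)
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total


# Hypothetical precomputed item-item similarities.
similarity: Similarities = {
    "A": {"B": 0.9, "C": 0.5, "D": 0.1},
    "B": {"A": 0.9, "C": 0.4, "D": 0.2},
    "C": {"A": 0.5, "B": 0.4, "D": 0.3},
    "D": {"A": 0.1, "B": 0.2, "C": 0.3},
}

small_model = build_model(similarity, k=2)  # A's neighborhood is B and C
full_model = build_model(similarity, k=3)   # A's neighborhood is every other item

only_rated_d = {"D": 4.0}
no_estimate = predict(small_model, only_rated_d, "A")   # D is outside the neighborhood
some_estimate = predict(full_model, only_rated_d, "A")  # D is now inside it
```

With k=2 the user who has rated only D gets no estimate for A, and with k=3 (no truncation at all in this toy data) the estimate comes back, which is just the current behavior. Re-running build_model is the periodic re-computation step.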
