Interesting question. So the preferences are synthetic in some cases -- you have a pref for every user-item combination? (Then what do you recommend? But I can imagine some answers.)
By "not work well", do you mean performance or accuracy?

For performance: yes, very dense input will really slow down the pre-computation step, which is more or less linear in the size of the input. The resulting diffs table is usually dense-ish, since an entry exists any time two items co-occur; in your case it would be completely filled. That also slows things down at runtime. All of this is a symptom of having such dense data. One answer is to prune noise from your data (or generate less synthetic data, if I'm guessing right). Another is to prune the diffs table. The least interesting entries are those with the highest standard deviation, so you could hack the code to trim based on that and get better runtime performance.

For accuracy: one guess is that the big assumption slope-one makes about its input isn't valid for your data. Slope-one assumes that the ratings for item X and item Y are linearly related: Y = mX + b. Rather than spend time regressing to determine m and b for each pair, which would be hugely expensive, it makes the reasonable assumption that m = 1 in all cases. The problem then becomes vastly simpler: find the best b = Y - X, which is just the average difference across all users who have prefs for both X and Y. That's a good assumption for most "normal" scenarios, but to the extent it's systematically untrue of your data, this will fall apart. Since I'm guessing much of your data is synthetic, I wonder whether there is some systematic incompatibility with this assumption.

On Wed, Oct 6, 2010 at 5:37 AM, Lance Norskog <[email protected]> wrote:
> I'm working with a DataModel that estimates preferences for all items
> from any user. This seems to not work well with the SlopeOne
> recommender. Are there tips&tricks for making recommenders work well
> with this class of model? That is, the sample datamodels all seem to
> explicitly store items and only return those prefs. My model cheerfully
> generates 1000 preferences if there are 1000 items.
>
> Thanks,
>
> --
> Lance Norskog
> [email protected]
>
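
P.S. To make the m = 1 assumption and the diffs-table pruning idea concrete, here is a minimal Python sketch. It is an illustration only, not Mahout's code: the function names (build_diffs, prune, predict), the toy ratings, the use of population standard deviation, and the count-weighted prediction are all assumptions I picked for the example.

```python
from collections import defaultdict
from math import sqrt


def build_diffs(prefs):
    """Build a diffs table from prefs of the form {user: {item: rating}}.

    Returns {(x, y): (mean, stddev, count)}, where mean is the average of
    (rating of y - rating of x) over users who rated both. This is the
    'b' in Y = X + b, i.e. the slope m is fixed at 1.
    """
    samples = defaultdict(list)
    for ratings in prefs.values():
        for x, rx in ratings.items():
            for y, ry in ratings.items():
                if x != y:
                    samples[(x, y)].append(ry - rx)

    diffs = {}
    for pair, values in samples.items():
        n = len(values)
        mean = sum(values) / n
        # Population standard deviation (an assumption of this sketch).
        variance = sum((v - mean) ** 2 for v in values) / n
        diffs[pair] = (mean, sqrt(variance), n)
    return diffs


def prune(diffs, max_stddev):
    """Drop entries whose differences are too noisy to be useful.

    Note: pairs seen only once have stddev 0 and survive this filter,
    so a real implementation might also require a minimum count.
    """
    return {pair: d for pair, d in diffs.items() if d[1] <= max_stddev}


def predict(prefs, diffs, user, target):
    """Estimate the user's rating for target, weighting each item by count.

    Returns None when no diffs entry links the user's items to target.
    """
    numerator = 0.0
    denominator = 0
    for x, rx in prefs[user].items():
        entry = diffs.get((x, target))
        if entry is None:
            continue
        mean, _stddev, count = entry
        numerator += (rx + mean) * count
        denominator += count
    return numerator / denominator if denominator else None


# Toy data, made up for this example.
prefs = {
    "u1": {"A": 5.0, "B": 3.0, "C": 2.0},
    "u2": {"A": 3.0, "B": 4.0},
    "u3": {"B": 2.0, "C": 5.0},
}

diffs = build_diffs(prefs)
estimate = predict(prefs, diffs, "u3", "A")  # (2.5 * 2 + 8.0 * 1) / 3

pruned = prune(diffs, max_stddev=1.0)
estimate_pruned = predict(prefs, pruned, "u3", "A")
```

With a fully dense model every (x, y) pair gets an entry, so the table grows with the square of the item count. That is why trimming by standard deviation (or by a minimum count) is worth trying before anything fancier.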
