Since I have a synthetic predictor built into the DataModel, do I need a Recommender?
On Wed, Oct 6, 2010 at 5:20 AM, Sean Owen <[email protected]> wrote:
> Interesting question. So the preferences are synthetic in some cases -- you
> have a pref for every user-item combination? (Then what do you recommend? but
> I can imagine some answers.)
>
> By "not work well" do you mean performance or accuracy?
>
> For performance, yes, having very dense input will really slow down the
> pre-computation step, which is more or less linear in the size of the input.
> The resulting diffs table is usually dense-ish, since an entry exists any
> time two items co-occur; in this case it would be completely filled. This
> would also slow down things at runtime.
>
> This is all a symptom of having such dense data. One answer would be to
> 'prune' noise from your data (or generate less synthetic data, if I guess
> that right).
>
> Another answer is to prune the diffs table. The least interesting entries
> are those with highest standard deviation. You could hack the code to trim
> based on that to get better runtime performance.
>
> If you mean accuracy, then one guess is that the big assumption that
> slope-one makes for the input isn't valid for your data. Slope-one assumes
> that the ratings for item X and item Y are linearly related: Y = mX + b.
> Rather than spend time regressing to determine m and b for each pair, which
> would be hugely expensive, it makes the reasonable assumption that m=1 in
> all cases. So the problem is vastly simpler: computing the best b = Y-X,
> which is just the average difference across all X / Y prefs.
>
> That's a good assumption for most "normal" scenarios. But to the extent it's
> systematically not true of your data, this will fall apart. Since I am
> guessing much data is synthetic, that's why I wonder if there is some
> systematic incompatibility with this assumption.
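To make the slope-one description above concrete, here is a minimal sketch in plain Java. It does not use Mahout's classes: the names SlopeOneSketch, Diff, buildDiffs, prune and predict are made up for illustration, item IDs are plain strings, and the count-weighted average in predict and the sample standard deviation used as the pruning criterion are assumptions on my part, not necessarily what the library does. It computes b as the average of Y - X for each co-occurring pair (the m=1 assumption), and shows what trimming the diffs table by standard deviation could look like.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SlopeOneSketch {

    /** Running mean and variance of (Y - X) for one item pair (Welford's method). */
    static final class Diff {
        int count;
        double mean;   // this is the "b" in Y = X + b
        double m2;     // sum of squared deviations from the mean

        void add(double d) {
            count++;
            double delta = d - mean;
            mean += delta / count;
            m2 += delta * (d - mean);
        }

        /** Sample standard deviation; 0 when fewer than two observations. */
        double stdDev() {
            return count > 1 ? Math.sqrt(m2 / (count - 1)) : 0.0;
        }
    }

    /** Key for the ordered pair X -> Y. Assumes item IDs never contain "->". */
    static String key(String x, String y) {
        return x + "->" + y;
    }

    /** One entry per ordered pair of items that co-occur in any user's prefs. */
    static Map<String, Diff> buildDiffs(List<Map<String, Double>> users) {
        Map<String, Diff> diffs = new HashMap<>();
        for (Map<String, Double> prefs : users) {
            for (Map.Entry<String, Double> x : prefs.entrySet()) {
                for (Map.Entry<String, Double> y : prefs.entrySet()) {
                    if (x.getKey().equals(y.getKey())) {
                        continue;
                    }
                    diffs.computeIfAbsent(key(x.getKey(), y.getKey()), k -> new Diff())
                         .add(y.getValue() - x.getValue());
                }
            }
        }
        return diffs;
    }

    /** Drops the noisiest entries: those whose standard deviation exceeds the threshold. */
    static void prune(Map<String, Diff> diffs, double maxStdDev) {
        diffs.values().removeIf(d -> d.stdDev() > maxStdDev);
    }

    /**
     * Estimates the pref for target as the count-weighted average of
     * (pref(X) + b) over the items X the user has rated.
     * Returns NaN when no usable diff entry exists.
     */
    static double predict(Map<String, Diff> diffs, Map<String, Double> prefs, String target) {
        double total = 0.0;
        int weight = 0;
        for (Map.Entry<String, Double> x : prefs.entrySet()) {
            Diff d = diffs.get(key(x.getKey(), target));
            if (d == null) {
                continue;
            }
            total += (x.getValue() + d.mean) * d.count;
            weight += d.count;
        }
        return weight == 0 ? Double.NaN : total / weight;
    }
}
```

For example, if two users rate (A, B) as (1, 2) and (2, 3), the A->B entry has mean 1, so a user who rated A as 3 gets a predicted B of 4. With a model that returns a pref for every user-item pair, the nested loop in buildDiffs visits every pair of items for every user, which is where the completely filled diffs table comes from.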
>
> On Wed, Oct 6, 2010 at 5:37 AM, Lance Norskog <[email protected]> wrote:
>
>> I'm working with a DataModel that estimates preferences for all items
>> from any user. This seems to not work well with the SlopeOne
>> recommender. Are there tips & tricks for making recommenders work well
>> with this class of model? That is, the sample datamodels all seem to
>> explicitly store items and only return those prefs. My model cheerfully
>> generates 1000 preferences if there are 1000 items.
>>
>> Thanks,
>>
>> --
>> Lance Norskog
>> [email protected]
>>
>

--
Lance Norskog
[email protected]
