Summarize, yes. But this actually is theoretically better, because the summarization introduces useful smoothing. That way you get recommendations for items even when there is no direct overlap.
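To make the smoothing point concrete, here is a minimal sketch. It assumes a made-up 3x3 interaction matrix and uses a rank-1 truncated SVD as a stand-in for "matrix factorization"; the data and variable names are hypothetical and not taken from any particular recommender.

```python
import numpy as np

# Toy user x item interaction matrix (hypothetical data, 1 = user touched item).
# Rows: users u0..u2. Columns: items A, B, C.
R = np.array([
    [1, 0, 0],   # u0 has only seen A
    [1, 1, 0],   # u1 links A and B
    [0, 1, 1],   # u2 links B and C
], dtype=float)

# Direct overlap: item-item co-occurrence counts, self-pairs removed.
cooc = R.T @ R
np.fill_diagonal(cooc, 0.0)
direct = R @ cooc   # score = co-occurrence with items the user already has

# "Summarized" view: keep only the top singular component (rank-1 SVD).
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 1
smoothed = (U[:, :k] * s[:k]) @ Vt[:k, :]

print("direct   u0 -> C:", direct[0, 2])
print("smoothed u0 -> C:", smoothed[0, 2])
```

In this toy data A and C never co-occur, so the direct score for u0 on C is zero, while the rank-1 reconstruction gives it a small positive score by way of B.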
Your point about noise is trenchant: small-count data is inherently noisy, because you can't have exactly 0.04 of an observation. And small counts dominate in recommendations.

On Fri, Nov 27, 2009 at 10:00 PM, Sean Owen <sro...@gmail.com> wrote:
>
> Correct me if I'm wrong, but my impression of matrix factorization
> approaches is that they're just a way to effectively "summarize" input
> data. They're not a theoretically better, or even different, approach
> to recommendation, but more a transformation of the input into
> conventional algorithms. (Though this process of simplification could,
> I imagine, sometimes be an improvement on the input, if it's noisy.)

--
Ted Dunning, CTO
DeepDyve