On Fri, Nov 27, 2009 at 11:23 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> Summarize yes.
>
> But this is, actually, theoretically better because the summarization
> introduces useful smoothing.  That way you get recommendations for items
> even if there is no direct overlap.
>

Summarize, smooth, and enhance clustering: distances are *not* preserved in
truncated decompositions, and the *hope* is that the "meaningful" distances
shrink while the less meaningful ones do not.

This can be seen in a simple example of user preferences (on the Netflix
scale of 1-5):

  user1: item1=4, item2=1, item3=5
  user2: item2=1, item5=1, item7=3, item8=3
  user3: item4=4, item5=1, item6=5

A first-order recommender can't infer any similarity or dissimilarity
between user1 and user3 (although it can see some similarity between user1
and user2, and between user2 and user3).  A decomposing recommender will
notice that user1 and user2 both hated item2, and that another item user2
hated (item5) is the same item user3 hated, and infer transitive
similarity: not just to 2nd degree as in this example, but to nth degree.
(A tiny numerical sketch of this is appended after the quoted text below.)

The difference between the various decompositional approaches is how they
approximate these transitive similarities: LDA should do best in the
very-low-overlap case, and SVD (more precisely, a sparse SVD which doesn't
treat missing data as a numerical 0 or as the mean of the observed values)
approaches that level of quality as the data gets bigger (and SVD /
randomized SVD should be a lot faster than LDA on really big data).

What I'd really like to see (once I get this decomposer stuff in - soon!
We've got good linear primitives now, so I'm working on it!) is also a
Restricted Boltzmann Machine based recommender, because that makes the
final leap from linear and quasi-linear decompositions to the truly
nonlinear case.  (A friend of mine on the executive team at Netflix tells
me it was apparent pretty early on that the winners were going to be
blends of the RBM and SVD-based approaches - and he was right!)

  -jake

> Your point about noisy is trenchant because small count data is inherently
> noisy because you can't have an exact 0.04 of an observation.  Small counts
> dominate in recommendations.
>
> On Fri, Nov 27, 2009 at 10:00 PM, Sean Owen <sro...@gmail.com> wrote:
>
> > Correct me if I'm wrong, but my impression of matrix factorization
> > approaches is that they're just a way to effectively "summarize" input
> > data.  They're not a theoretically better, or even different, approach
> > to recommendation, but more a transformation of the input into
> > conventional algorithms.  (Though this process of simplification could,
> > I imagine, sometimes be an improvement on the input, if it's noisy.)
>
> --
> Ted Dunning, CTO
> DeepDyve
>
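P.S.  Since the toy example above is easy to poke at, here is a minimal
numpy sketch of the mechanism (nothing Mahout-specific; the variable and
helper names are just for illustration).  It fills missing ratings with 0,
which is exactly the shortcut a real sparse SVD should avoid, so take it as
a sketch of the mechanism rather than the numbers:

import numpy as np

# Toy ratings from the example above (rows = user1..user3, cols = item1..item8).
# Missing ratings are filled with 0 only to keep the sketch short; this is the
# naive treatment of missing data, not what a proper sparse SVD would do.
A = np.array([
    [4, 1, 5, 0, 0, 0, 0, 0],  # user1
    [0, 1, 0, 0, 1, 0, 3, 3],  # user2
    [0, 0, 0, 4, 1, 5, 0, 0],  # user3
], dtype=float)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# First-order similarity: user1 and user3 rate no items in common, so their
# raw cosine similarity is exactly 0 regardless of the rating values.
print("raw cos(user1, user3) =", cosine(A[0], A[2]))

# Truncated SVD: keep the top k singular triples and describe each user by
# its k latent-factor coordinates (row of U_k scaled by the singular values).
k = 1
U, s, Vt = np.linalg.svd(A, full_matrices=False)
user_factors = U[:, :k] * s[:k]

# In the latent space the zero is no longer structural: users with no items
# in common can still end up close, because the factors are built from the
# whole matrix at once.  (With only three users this toy is degenerate, so
# what matters is that the similarity is no longer pinned at 0.)
print("latent cos(user1, user3) =", cosine(user_factors[0], user_factors[2]))

On real data you'd use a rank far smaller than the number of users and
items, handle the missing entries properly (and probably center the
ratings), but the transitive similarity described above comes from the
same place.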