What you are describing looks a lot like factor analysis, in which you
decompose a space into latent subspaces by doing something like PCA
(Principal Component Analysis): each factor (subspace dimension)
explains a percentage of the variance in the original variables, and
the coordinates of each input datapoint along the new axes express its
degree of assignment to each factor.
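
For concreteness, a minimal sketch of that mechanic in Python/numpy
(the ratings matrix is synthetic, just my own toy example):

import numpy as np

# toy user-item matrix (rows = users, columns = movies); synthetic data
R = np.random.default_rng(0).integers(1, 6, size=(100, 20)).astype(float)

# center each column (movie) before the PCA
X = R - R.mean(axis=0)

# PCA via the eigendecomposition of the covariance matrix
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]        # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# percentage of variance explained by each factor
explained = eigvals / eigvals.sum()

# coordinates of each user along the factors ("degree of assignment")
scores = X @ eigvecs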

This, in turn, is actually very close to the SVD: doing an SVD on the
centered user-item matrix is equivalent to a PCA on the covariance
matrix of that data (and if the data is also scaled to unit variance,
the covariance matrix coincides with the correlation matrix). The
squared singular values, divided by n-1, then match the variances along
the principal axes, and the right singular vectors give the principal
directions.
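
To make that correspondence concrete, a quick numerical check, reusing
the X, eigvals and scores from the snippet above:

# SVD of the centered matrix recovers the PCA
U, s, Vt = np.linalg.svd(X, full_matrices=False)

n = X.shape[0]
# squared singular values / (n - 1) match the covariance eigenvalues
print(np.allclose(np.sort(s**2 / (n - 1)), np.sort(eigvals)))   # True

# and the rows of Vt are the principal directions (up to sign),
# so U * s reproduces the factor coordinates computed as X @ eigvecs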

So I think you *could* use an SVD to find latent spaces; this is
similar to what Netflix Prize participants published, e.g. "Matrix
Factorization Techniques for Recommender Systems" by Koren et al. I
don't know if this is what you were commenting on (see
http://www2.research.att.com/~volinsky/papers/ieeecomputer.pdf).
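
In case it's useful, a bare-bones sketch of the latent-factor idea from
that line of work: plain SGD on observed ratings, where the dimensions,
learning rate and regularization below are arbitrary choices of mine,
not values from the paper:

import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, k = 100, 20, 5

# hypothetical observed ratings as (user, item, rating) triples
ratings = [(rng.integers(n_users), rng.integers(n_items),
            float(rng.integers(1, 6))) for _ in range(500)]

P = 0.1 * rng.standard_normal((n_users, k))   # user factors
Q = 0.1 * rng.standard_normal((n_items, k))   # item factors
lr, reg = 0.01, 0.05

for _ in range(50):                           # SGD epochs
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]                 # prediction error
        pu = P[u].copy()
        P[u] += lr * (err * Q[i] - reg * pu)
        Q[i] += lr * (err * pu - reg * Q[i])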

One limitation of that type of analysis is that it assumes the factors
or dimensions are linearly separable. And yes, the axes are orthogonal
(that's why it's useful: it separates the data into orthogonal
dimensions), but I don't see why this would be a problem: you *want*
different axes to span different (as orthogonal as possible)
dimensions, don't you?

Of course the onus is on *you* to interpret the *meaning* of the
resulting axes. Some of them will be straightforward, some not, and
there's a strong chance that you won't find the precise semantic axis
you're looking for.

If you *do* want pre-determined meanings on certain axes, in principle
you could try to find your prototypes (the movies that define your
desired axis) and rotate the PCA matrix to align one axis with the
vector that defines your prototype in PCA space. Of course, if you have
more than one prototype, and they are not mutually orthogonal, a single
rotation won't do. But I'm wildly speculating here; I'm not sure this
would be feasible.
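
A minimal sketch of what that single-prototype rotation could look like
(numpy; the prototype vector p in PCA space is assumed to be given):

import numpy as np

def rotation_to_axis(p):
    """Orthogonal matrix mapping the direction of p onto the first axis."""
    p = p / np.linalg.norm(p)
    k = len(p)
    # QR on [p | random vectors] gives an orthonormal basis whose first
    # vector is +/- p; flip the sign so that it is exactly p
    M = np.column_stack([p, np.random.default_rng(0).standard_normal((k, k - 1))])
    Q, _ = np.linalg.qr(M)
    if Q[:, 0] @ p < 0:
        Q = -Q
    return Q.T

# usage: rotate the PCA coordinates so that the first column measures
# alignment with the prototype
# rotated_scores = scores @ rotation_to_axis(p).T

Strictly speaking the result may be a reflection rather than a proper
rotation, but for re-labelling an axis that distinction doesn't matter.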

Paulo



There is a problem I've wanted to solve for a long time. Suppose you
want to find antipodes in preferences: "axes of interest". In movie
preferences, Star Trek movies (male nerds) vs. Sex and the City
(middle-class women) might be one axis; historical documentaries vs.
1950s Douglas Sirk melodramas (don't ask) might be another. These axes
are not orthogonal. (I saw this analysis in a presentation by one of
the Netflix Competition finalists. Unfortunately, I did not ask him how
he did it.)

Thank you for this hint. Negative correlations make this possible.
Given an item-item matrix of Pearson similarities (correlations), how
would you isolate these axes? The minimum and maximum movies are easy
to find. Each axis endpoint is a small cluster inside a genre. How
would you find these small clusters? They're not orthogonal, so a naive
SVD would not help. What would be a good algorithm for this?
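
To make the question concrete, the endpoint-finding step might start
like this (numpy; S is the item-item Pearson correlation matrix, and
the neighbour count is an arbitrary guess):

import numpy as np

def axis_endpoints(S, n_neighbors=5):
    """Most anti-correlated item pair, plus a small cluster around each."""
    # the most negative entry marks the two endpoints of one axis
    i, j = np.unravel_index(np.argmin(S), S.shape)
    # each endpoint's cluster: the items most similar to it
    cluster_i = np.argsort(S[i])[::-1][:n_neighbors]
    cluster_j = np.argsort(S[j])[::-1][:n_neighbors]
    return (i, cluster_i), (j, cluster_j)

But that only yields one axis, and how to mask it out to find the next
one is exactly the part I don't know how to do properly.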

Lance

----- Original Message -----
| From: "Paulo Villegas" <[email protected]>
| To: [email protected]
| Sent: Monday, November 26, 2012 2:03:59 PM
| Subject: Re: Recommender's formula
|
| [...]
|
| They can be negative for certain similarity metrics, most notably
| Pearson (which has sign, negative similarities express negative
| correlations), other similarity metrics are strictly positive and
| therefore do not present that problem.
|
| [...]


