What you are describing looks a lot like factor analysis, in which you decompose a space into latent subspaces by doing something like PCA (Principal Component Analysis): each factor (subspace dimension) explains a percentage of the variance in the original variables, and the coordinates of each input datapoint in the resulting space express its degree of assignment to each factor.
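As a toy illustration of that (scikit-learn assumed; the ratings matrix is random, purely to show the shapes involved):

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    # 100 users x 20 items, random 1-5 ratings standing in for real data
    ratings = rng.integers(1, 6, size=(100, 20)).astype(float)

    pca = PCA(n_components=5)
    coords = pca.fit_transform(ratings)   # each user's coordinates on the 5 factors
    print(pca.explained_variance_ratio_)  # fraction of variance each factor explains
    print(coords[0])                      # "degree of assignment" of user 0 to each factor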
This, in turn, is very similar to an SVD (singular value decomposition): doing an SVD on the (centered) user-item matrix is akin to a PCA on the covariance matrix of that data (and the covariance matrix equals the matrix of Pearson correlation coefficients if the data is standardized, i.e. centered and scaled to unit variance). The right singular vectors would then correspond to the principal components, and the squared singular values to their variances. So I think you *could* use an SVD to find latent spaces, and this is similar to what Netflix Prize participants published, e.g. "Matrix Factorization Techniques for Recommender Systems" by Koren et al. (see http://www2.research.att.com/~volinsky/papers/ieeecomputer.pdf); I don't know if this is what you were commenting about.

One limitation of that type of analysis is that it assumes the factors or dimensions are linearly separable. And yes, the axes are orthogonal (that's why it's useful: it separates the data into orthogonal dimensions), but I don't see why this would be a problem: you *want* different axes to span different (as orthogonal as possible) dimensions, don't you? Of course, the onus is on *you* to interpret the *meaning* of the resulting axes. Some of them will be straightforward, some not, and there's a strong chance that you won't find the precise semantic axis you're looking for.

If you *do* want pre-determined meanings on certain axes, in principle you could try to find your prototypes (the movies that define your desired axis) and rotate your PCA matrix to align one axis with the vector that defines your prototype in the PCA space. Of course, if you have more than one prototype and they're not orthogonal, a single rotation won't do. But I'm wildly speculating here; not sure if this would be feasible.

Paulo
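P.S. To make the speculation concrete, here is a toy sketch of both points (NumPy assumed; the data is random, and the "prototype" is arbitrarily item 0, purely for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    R = rng.normal(size=(100, 20))        # toy user-item matrix (100 users, 20 items)
    Rc = R - R.mean(axis=0)               # center each item column

    # SVD of the centered matrix: rows of Vt are PCA's principal directions,
    # and s**2 / (n_users - 1) are the variances PCA would report for them.
    U, s, Vt = np.linalg.svd(Rc, full_matrices=False)
    k = 5
    item_factors = Vt[:k]                 # k x 20: items in the latent space

    # Rotation idea: build an orthonormal basis whose first vector is the
    # latent vector of the prototype item, via QR on [prototype | filler],
    # then re-express all items in that basis.
    proto = item_factors[:, 0] / np.linalg.norm(item_factors[:, 0])
    filler = np.eye(k)[:, 1:]             # arbitrary columns completing the basis
    Q, _ = np.linalg.qr(np.column_stack([proto, filler]))
    if Q[:, 0] @ proto < 0:               # fix QR's sign so axis 0 points at the prototype
        Q[:, 0] *= -1
    rotated = Q.T @ item_factors          # same items, new axes
    print(rotated[0])                     # each item's coordinate along the prototype axis

The filler columns fed to QR are arbitrary; only the first axis is pinned down, which is exactly why a single rotation cannot satisfy several non-orthogonal prototypes at once.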
There is a problem I've wanted to solve for a long time. Suppose you want to find antipodes in preferences: "axes of interest". In movie preferences, one axis might run from Star Trek movies (male nerds) to Sex and the City (middle-class women); another from historical documentaries to 1950s Douglas Sirk melodramas (don't ask). These axes are not orthogonal. (I saw this analysis in a presentation by one of the Netflix Competition finalists. Unfortunately, I did not ask him how he made it.)

Thank you for this hint. Negative correlations make this possible. Given an item-item matrix of Pearson correlations, how would you isolate these axes? The movie pairs with the most negative and most positive correlations are easy to find. Each axis endpoint is a small cluster inside a genre. How would you find these small clusters? They're not orthogonal, so a naive SVD would not help. What is a good algorithm for this? (A toy sketch of the brute-force approach I can imagine follows the quoted message below.)

Lance

----- Original Message -----
| From: "Paulo Villegas" <[email protected]>
| To: [email protected]
| Sent: Monday, November 26, 2012 2:03:59 PM
| Subject: Re: Recommender's formula
|
| [...]
|
| They can be negative for certain similarity metrics, most notably
| Pearson (which has sign: negative similarities express negative
| correlations); other similarity metrics are strictly positive and
| therefore do not present that problem.
|
| [...]
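P.S. The brute-force heuristic I can imagine, wildly guessing (Python/NumPy assumed; antipodal_axis and the random data are made up for illustration): seed an axis with the most negatively correlated item pair, then grow a small cluster around each endpoint from its most positively correlated items.

    import numpy as np

    def antipodal_axis(P, cluster_size=5):
        """P: item-item Pearson correlation matrix (symmetric, diagonal 1)."""
        masked = P.copy()
        np.fill_diagonal(masked, 0.0)     # ignore self-correlations
        # most antipodal pair = most negative off-diagonal entry
        i, j = np.unravel_index(np.argmin(masked), masked.shape)
        # cluster around each endpoint: its top positively-correlated items
        # (each endpoint is included in its own cluster, since P[i, i] = 1)
        left = np.argsort(P[i])[::-1][:cluster_size]
        right = np.argsort(P[j])[::-1][:cluster_size]
        return left, right

    # toy usage: a random matrix standing in for real item-item correlations
    rng = np.random.default_rng(0)
    A = rng.normal(size=(30, 8))          # 30 items described by 8 features
    P = np.corrcoef(A)                    # 30 x 30 Pearson matrix
    left, right = antipodal_axis(P)
    print(left, right)

To pull out several (non-orthogonal) axes, one could repeat this after masking out the items already assigned. That at least avoids the orthogonality constraint an SVD would impose, though it is a greedy heuristic, not a principled factorization.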
