Lance,

The columns of U are, in some contexts, called "latent factors". For example, if we apply SVD to a Document (User)-Term (Item) matrix, the columns of U can be interpreted as a representation of groups of terms (words that have similar meanings or tend to appear together in documents of the same kind), so in this case the "latent" factors are, in some way, "topics". Another example is the famous movie recommendation problem. There, the "latent" factors (columns of the U matrix) represent something like "movie topics" (drama, horror, comedy, and possible combinations of these...). Note that when we make movie recommendations, we recommend movies that have a similar topic, i.e. we will probably recommend a whole topic rather than a specific movie, but SVD helps us find which movies fall into that topic. Note also that this "topic" could in fact be something more abstract than "drama" or "comedy".
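To make the "latent factor" idea a bit more concrete, here is a minimal sketch using NumPy. The ratings matrix, the movie names, and the choice of k = 2 are all made up for illustration, and treating "not rated" as 0 is a simplifying assumption. In this users-by-movies orientation, each latent factor has one column in U (one entry per user) and one column in V (one entry per movie, stored as a row of Vt).

```python
import numpy as np

# Hypothetical ratings matrix (illustrative data, not from the thread):
# rows are users, columns are movies, 0 stands in for "not rated".
movies = ["Drama A", "Drama B", "Comedy A", "Comedy B"]
ratings = np.array([
    [5.0, 4.0, 0.0, 0.0],   # user 0: mostly dramas
    [4.0, 5.0, 0.0, 1.0],   # user 1: mostly dramas
    [0.0, 0.0, 5.0, 4.0],   # user 2: mostly comedies
    [1.0, 0.0, 4.0, 5.0],   # user 3: mostly comedies
])

# Thin SVD: ratings = U @ diag(s) @ Vt, with s sorted in decreasing order.
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

k = 2  # keep only the two strongest latent factors (an arbitrary choice)
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

# Rank-k approximation of the ratings built from the kept factors only.
approx = U_k @ np.diag(s_k) @ Vt_k

print("singular values:", np.round(s, 2).tolist())
for f in range(k):
    # Row f of Vt holds the loading of every movie on latent factor f.
    loadings = np.round(Vt_k[f], 2).tolist()
    print(f"factor {f} movie loadings:", dict(zip(movies, loadings)))
print("rank-2 approximation:")
print(np.round(approx, 1))
```

The signs of the loadings are arbitrary (SVD only fixes each factor up to a sign flip), so it is the pattern of which movies load together that carries the "topic", not the sign itself.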
The interpretation of V is more or less the "transpose" of this. In the movie example, the columns of V can be seen as a representation of groups of users who have seen (or rated) the same movies. So if two movies have a similar topic, they have probably been rated or seen by the same people, so both movies will have similar values in the V column representing that group of people... In practice, rows of U can be used to find distances between users (according to what they have rated), and rows of V (that is, columns of Vt) can be used to find distances between movies (according to who has rated them). Last, the values of S, as some other users pointed out, can be seen as a "weight" of the importance of these "latent" factors when I'm trying to see the differences between movies or between users.

Hope this helps. Please, anyone, correct me if you see something wrong in my examples.

Best,
Fernando.

2010/11/22 Ted Dunning <[email protected]>

> Commonly the square root of S is applied to both U and V. S is a set of
> importance weightings for the otherwise normalized columns of U and V.
>
> On Mon, Nov 22, 2010 at 10:10 AM, Sean Owen <[email protected]> wrote:
>
> > Hmm. I think I need to fix the second half of my analogy.
> >
> > It's really U x S that could be said to be users' preferences for
> > pseudo-items. and S x VT could be said to be pseudo-users preferences for
> > real items. S itself is a diagonal matrix of course and those values are
> > kind of like "scaling factors" ... but I actually struggle to come up with a
> > good intuitive explanation of what S itself is (or really, U and V by
> > themselves).
> >
> > Anyone smarter have a nice pithy analogy?
> >
> > On Mon, Nov 22, 2010 at 11:06 AM, Sean Owen <[email protected]> wrote:
> >
> > > In more CF-oriented terms, S is an expression of pseudo-users' preferences
> > > for pseudo-items. And then U expresses how much each real user corresponds
> > > to each pseudo-user, and likewise for V and items.
> > >
> > > To put out a speculative analogy -- let's say we're looking at users'
> > > preferences for songs. The "pseudo-items" that the SVD comes up with might
> > > correspond to something like genres, or logical groupings of songs.
> > > "Pseudo-users" are something like types of listeners, perhaps corresponding
> > > to demographics.
> > >
> > > Whereas an entry in the original matrix makes a statement like "Tommy likes
> > > the band Filter", an entry in S makes a statement like "Teenage boys in
> > > moderately affluent households like industrial metal". And U says how much
> > > Tommy is part of this demographic, and V tells how much Filter is industrial
> > > metal.
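
For anyone who wants to try the scalings discussed in the thread (U x S, S x VT, and the square root of S applied to both sides), here is a rough NumPy sketch. The ratings matrix is hypothetical, the variable names are my own, and using Euclidean distance is just one possible choice, so take it as an illustration rather than the way any particular library does it.

```python
import numpy as np

# Hypothetical user x movie ratings (illustrative only, 0 = "not rated").
ratings = np.array([
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 1.0],
    [0.0, 0.0, 5.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
])

U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

# U x S: each row describes one real user over the weighted latent factors.
user_prefs = U * s                   # same as U @ np.diag(s)
# S x VT: each column describes one real movie over the weighted factors.
item_prefs = s[:, None] * Vt         # same as np.diag(s) @ Vt

# Splitting S evenly: apply sqrt(S) to both U and V.
root = np.sqrt(s)
user_vecs = U * root                 # one row per user
movie_vecs = (root[:, None] * Vt).T  # one row per movie


def distance(a, b):
    """Euclidean distance between two vectors in factor space."""
    return float(np.linalg.norm(a - b))


# Users 0 and 1 both prefer dramas; user 2 prefers comedies.
print("user 0 vs user 1:", round(distance(user_vecs[0], user_vecs[1]), 3))
print("user 0 vs user 2:", round(distance(user_vecs[0], user_vecs[2]), 3))
# Movies 0 and 1 are dramas; movie 2 is a comedy.
print("movie 0 vs movie 1:", round(distance(movie_vecs[0], movie_vecs[1]), 3))
print("movie 0 vs movie 2:", round(distance(movie_vecs[0], movie_vecs[2]), 3))
```

With the sqrt(S) split, the dot product of a user vector and a movie vector gives back the corresponding rating, which is one reason that scaling is convenient. With the full U x S scaling and all factors kept, distances between users are the same as distances between the original rating rows, so the scaling only starts to matter once some factors are dropped.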
