On Wed, Oct 31, 2012 at 8:21 AM, Perko, Ralph J <[email protected]> wrote:

> I am using the mahout pca function to project a set of documents into 2-d
> space. From what I understand, the pca function in mahout generates a [U]
> matrix using a [V] matrix for translation from the high-dimensional data
> representation to the 2-d representation.

Yes. More specifically, you probably need the U*Sigma output (the "-us true" option).

> Is there a way that I can specify the [V] matrix if I already have one I
> would like for it to use?

If you already have V, you don't need to run SVD. What you need is the product (A-M)V, which is very nearly the same as the U*Sigma output. That is a different thing from what SSVD does: "SSVD --pca true" finds the SVD of A-M (or A-Xi in the docs), which is a more complicated task than just computing (A-M)V.

You can compute (A-M)V with the help of the DistributedMatrix matrix operations: there is a column mean computation and a matrix multiplication. (I am not immediately sure whether the multiplication and the vector subtraction can be combined in one run.)

> Thanks,
> Ralph
> __________________________________________________
> Ralph Perko
> Pacific Northwest National Laboratory
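To make the (A-M)V computation concrete, here is a minimal sketch in plain Python. It is not Mahout code: ordinary lists stand in for the distributed matrices, and the function names (column_means, matmul, project_centered, project_then_shift) are made up for illustration. The second variant relies on the identity (A-M)V = AV - 1(m'V), where m is the row of column means. That identity suggests one way the subtraction could be folded into the multiplication, but whether Mahout can do it in a single pass is not something this sketch shows.

```python
# Sketch only: plain Python lists stand in for distributed matrices.
# All function names here are hypothetical, not part of Mahout.

def column_means(a):
    """Mean of each column of a (a list of equal-length rows)."""
    n = len(a)
    return [sum(col) / n for col in zip(*a)]

def matmul(a, b):
    """Dense matrix product a * b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def project_centered(a, v):
    """(A - M) V computed directly: subtract column means, then multiply."""
    m = column_means(a)
    centered = [[x - mu for x, mu in zip(row, m)] for row in a]
    return matmul(centered, v)

def project_then_shift(a, v):
    """Same quantity as A V - 1 (m' V): multiply first, then subtract
    one small row (m' V) from every row of the product."""
    m = column_means(a)
    mv = matmul([m], v)[0]
    return [[x - s for x, s in zip(row, mv)] for row in matmul(a, v)]

if __name__ == "__main__":
    A = [[1, 2, 3], [3, 4, 7], [5, 6, 5]]   # 3 documents, 3 features
    V = [[1, 0], [0, 1], [1, 1]]            # an arbitrary 3 x 2 matrix
    print(project_centered(A, V))
    print(project_then_shift(A, V))
```

Both functions return the same 3 x 2 matrix for this input. The second form never materializes A-M, which matters when A is sparse and centering would make it dense.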
