See http://en.wikipedia.org/wiki/Singular_value_decomposition, "relation to eigenvalue decomposition".
Depending on what you consider source for the eigendecompostion, AA' or A'A, the eigenvectors would be column vectors of U or V respectively. On Thu, Jul 26, 2012 at 1:12 PM, Aniruddha Basak <[email protected]> wrote: > Hi, > I am trying to use SSVD instead of Lanczos, as a part of Spectral Kmeans. > However, I could not find the relation between the eigenvectors and U, V > matrices. > Can someone please tell me, how to retrieve the eigenvectors from SSVD > decomposition ? > > Thanks, > Aniruddha > > > > -----Original Message----- > From: Dmitriy Lyubimov [mailto:[email protected]] > Sent: Thursday, July 19, 2012 10:53 PM > To: [email protected] > Subject: RE: eigendecomposition of very large matrices > > Pps if you do insist on having a lot of k then you'll benefit from smaller > hdfs block size, not larger. > On Jul 19, 2012 10:50 PM, "Dmitriy Lyubimov" <[email protected]> wrote: > >> Yeah I see OK. Both two experiments conducted with mahout ssvd I am >> familiar with dealt with input size greater than yours element wise, >> on a quite modest node count. So i don't think your input size will be >> a problem. But the number of singular values will be. >> >> But I doubt any input will yield anything useful beyond k=200 but >> statistical noise. Even if you have a good decay of the singular values. >> But I bet you don't need that many. You can fit significantly more >> 'clusters' on a 'fairly small' dimensional space. >> On Jul 19, 2012 6:33 PM, "Aniruddha Basak" <[email protected]> wrote: >> >>> Thanks Dmitriy for your reply. >>> The matrix I am working on, has 10-20 non zero entries per row. So >>> its very sparse. >>> I am trying to do spectral clustering which involves eigen-decomposition. >>> I am wondering whether anyone has tried to do spectral clustering >>> using mahout for very large affinity matrix (input). >>> >>> Aniruddha >>> >>> >>> -----Original Message----- >>> From: Dmitriy Lyubimov [mailto:[email protected]] >>> Sent: Thursday, July 19, 2012 6:28 PM >>> To: [email protected] >>> Subject: Re: eigendecomposition of very large matrices >>> >>> very significant sparsity may be a problem though for -q >=1 parameters. >>> Again, depends on the hardware you have and the # of non-zero >>> elements in the input. but -q=1 is still the most recommended setting here. >>> >>> >>> On Thu, Jul 19, 2012 at 6:20 PM, Dmitriy Lyubimov <[email protected]> >>> wrote: >>> > you may try SSVD. >>> > https://cwiki.apache.org/confluence/display/MAHOUT/Stochastic+Singu >>> > lar >>> > +Value+Decomposition >>> > >>> > but 4k eigenvectors (or, rather, singular values) is kind of still >>> > a lot though and may push the precision out of the error estimates. >>> > I don't we had precision study for that many. Also need quite a bit >>> > of memory to compute that (not to mention flops). More >>> > realistically you probably may try 1k singular values . You may try >>> > more if you have access to more powerful hardware than we did in >>> > the studies but distributed computation time will grow at about >>> > k^1.5, i.e. faster than linear, even if you have enough nodes for the >>> > tasks. >>> > >>> > -d >>> > >>> > On Thu, Jul 19, 2012 at 6:12 PM, Aniruddha Basak >>> > <[email protected]> >>> wrote: >>> >> Hi, >>> >> I am working on a clustering problem which involves determining >>> >> the largest "k" eigenvectors of a very large matrix. The matrices, >>> >> I work on, are typically of the order of 10^6 by 10^6. >>> >> Trying to do this using the Lanczos solver available in Mahout, I >>> >> found it is very slow and takes around 1.5 minutes to compute each >>> eigenvectors. >>> >> Hence to get 4000 eigenvectors, it takes 100 hours or 4 days !! >>> >> >>> >> So I am looking for something faster to solve the "Eigen decomposition" >>> >> problem for very large sparse matrix. Please suggest me what >>> >> should I >>> use ? >>> >> >>> >> >>> >> Thanks, >>> >> Aniruddha >>> >> >>> >>
