Oops. I actually don't know that. SSVD is SVD (as much as Lanczos is). I guess you need help from the rest of the collective.
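To expand on "SSVD is SVD" with a sketch that is not from the thread: when A is symmetric (as an affinity matrix is), no factor B with BB' = A is needed at all. The SVD of A itself already yields the eigendecomposition: the columns of U are eigenvectors of A, and the singular values equal the eigenvalues up to sign. A minimal numpy check of this fact (illustrative only, not Mahout code):

```python
# Sketch: for a symmetric matrix A, the left singular vectors ARE
# eigenvectors of A, and singular values are |eigenvalues|.
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M + M.T                      # small symmetric test matrix

U, s, Vt = np.linalg.svd(A)      # A = U @ diag(s) @ Vt
w, V = np.linalg.eigh(A)         # reference eigendecomposition

# Singular values are the absolute eigenvalues, sorted descending.
assert np.allclose(s, np.sort(np.abs(w))[::-1])

# Each column of U is an eigenvector: A u = lambda u, |lambda| = sigma.
for i in range(5):
    u = U[:, i]
    lam = u @ (A @ u)            # Rayleigh quotient recovers the sign
    assert np.allclose(A @ u, lam * u)
    assert np.isclose(abs(lam), s[i])
```

The only subtlety is that the SVD orders by |eigenvalue| and discards signs, so the sign of each eigenvalue has to be recovered, e.g. from the Rayleigh quotient u'Au as above.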
On Thu, Jul 26, 2012 at 1:21 PM, Aniruddha Basak <[email protected]> wrote:
> Actually that's my confusion. I don't need the eigenvectors of AA'
> but of A! If I can find a matrix B such that BB' = A, then from the
> SVD decomposition of B we can get the eigenvectors of A. But how to
> get B in that case?
>
> Aniruddha
>
> -----Original Message-----
> From: Dmitriy Lyubimov [mailto:[email protected]]
> Sent: Thursday, July 26, 2012 1:18 PM
> To: [email protected]
> Subject: Re: eigendecomposition of very large matrices
>
> See http://en.wikipedia.org/wiki/Singular_value_decomposition,
> "relation to eigenvalue decomposition".
>
> Depending on what you consider the source for the eigendecomposition,
> AA' or A'A, the eigenvectors would be the column vectors of U or V
> respectively.
>
> On Thu, Jul 26, 2012 at 1:12 PM, Aniruddha Basak <[email protected]> wrote:
>> Hi,
>> I am trying to use SSVD instead of Lanczos as a part of spectral
>> k-means. However, I could not find the relation between the
>> eigenvectors and the U, V matrices. Can someone please tell me how
>> to retrieve the eigenvectors from the SSVD decomposition?
>>
>> Thanks,
>> Aniruddha
>>
>> -----Original Message-----
>> From: Dmitriy Lyubimov [mailto:[email protected]]
>> Sent: Thursday, July 19, 2012 10:53 PM
>> To: [email protected]
>> Subject: RE: eigendecomposition of very large matrices
>>
>> PPS: if you do insist on having a lot of k, then you'll benefit from
>> a smaller HDFS block size, not a larger one.
>> On Jul 19, 2012 10:50 PM, "Dmitriy Lyubimov" <[email protected]> wrote:
>>
>>> Yeah, I see, OK. Both experiments conducted with Mahout SSVD that I
>>> am familiar with dealt with input sizes greater than yours,
>>> element-wise, on a quite modest node count. So I don't think your
>>> input size will be a problem. But the number of singular values
>>> will be.
>>>
>>> I doubt any input will yield anything but statistical noise beyond
>>> k=200, even if you have a good decay of the singular values.
>>> But I bet you don't need that many. You can fit significantly more
>>> 'clusters' in a 'fairly small'-dimensional space.
>>> On Jul 19, 2012 6:33 PM, "Aniruddha Basak" <[email protected]> wrote:
>>>
>>>> Thanks Dmitriy for your reply.
>>>> The matrix I am working on has 10-20 non-zero entries per row, so
>>>> it's very sparse.
>>>> I am trying to do spectral clustering, which involves
>>>> eigendecomposition. I am wondering whether anyone has tried to do
>>>> spectral clustering using Mahout for a very large affinity matrix
>>>> (input).
>>>>
>>>> Aniruddha
>>>>
>>>> -----Original Message-----
>>>> From: Dmitriy Lyubimov [mailto:[email protected]]
>>>> Sent: Thursday, July 19, 2012 6:28 PM
>>>> To: [email protected]
>>>> Subject: Re: eigendecomposition of very large matrices
>>>>
>>>> Very significant sparsity may be a problem, though, for -q >= 1
>>>> parameters. Again, it depends on the hardware you have and the
>>>> number of non-zero elements in the input, but -q=1 is still the
>>>> most recommended setting here.
>>>>
>>>> On Thu, Jul 19, 2012 at 6:20 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>> > You may try SSVD:
>>>> > https://cwiki.apache.org/confluence/display/MAHOUT/Stochastic+Singular+Value+Decomposition
>>>> >
>>>> > But 4k eigenvectors (or, rather, singular values) is still kind
>>>> > of a lot and may push the precision out of the error estimates.
>>>> > I don't think we had a precision study for that many. You also
>>>> > need quite a bit of memory to compute that (not to mention
>>>> > flops). More realistically, you may try 1k singular values. You
>>>> > may try more if you have access to more powerful hardware than
>>>> > we did in the studies, but distributed computation time will
>>>> > grow at about k^1.5, i.e. faster than linearly, even if you have
>>>> > enough nodes for the tasks.
>>>> >
>>>> > -d
>>>> >
>>>> > On Thu, Jul 19, 2012 at 6:12 PM, Aniruddha Basak <[email protected]> wrote:
>>>> >> Hi,
>>>> >> I am working on a clustering problem which involves determining
>>>> >> the largest "k" eigenvectors of a very large matrix. The
>>>> >> matrices I work on are typically of the order of 10^6 by 10^6.
>>>> >> Trying to do this using the Lanczos solver available in Mahout,
>>>> >> I found it is very slow and takes around 1.5 minutes to compute
>>>> >> each eigenvector. Hence to get 4000 eigenvectors, it takes 100
>>>> >> hours, or 4 days!!
>>>> >>
>>>> >> So I am looking for something faster to solve the
>>>> >> eigendecomposition problem for a very large sparse matrix.
>>>> >> Please suggest what I should use.
>>>> >>
>>>> >> Thanks,
>>>> >> Aniruddha
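For readers following the thread: the stochastic SVD recommended above is based on the randomized-projection scheme of Halko, Martinsson, and Tropp, and the -q option discussed here corresponds to the number of power iterations. A hedged numpy sketch of that idea, for intuition only (the function and parameter names below are illustrative, not Mahout's API):

```python
# Sketch of randomized SVD: project A onto a random subspace of
# dimension k+p, optionally run q power iterations (the thread's -q=1),
# then take an exact SVD of the small projected matrix.
import numpy as np

def randomized_svd(A, k, p=10, q=1, seed=0):
    """Approximate rank-k SVD with oversampling p and q power iterations."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Y = A @ rng.standard_normal((n, k + p))   # random sketch of range(A)
    Q, _ = np.linalg.qr(Y)                    # orthonormal basis
    for _ in range(q):                        # power iterations sharpen the
        Q, _ = np.linalg.qr(A.T @ Q)          # subspace when singular values
        Q, _ = np.linalg.qr(A @ Q)            # decay slowly
    B = Q.T @ A                               # small (k+p) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]

# On an exactly low-rank input the sketch captures the range, so the
# leading singular values match the exact ones.
rng = np.random.default_rng(1)
A = rng.standard_normal((300, 25)) @ rng.standard_normal((25, 200))
U, s, Vt = randomized_svd(A, k=20)
s_exact = np.linalg.svd(A, compute_uv=False)[:20]
assert np.allclose(s, s_exact, rtol=1e-6)
```

This also shows why sparsity interacts with -q: each power iteration costs two extra passes of multiplication by A and A', which is where very sparse inputs with 10-20 non-zeros per row change the cost profile.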
