I take it back about Lanczos.
On Thu, Jul 26, 2012 at 1:30 PM, Dmitriy Lyubimov <[email protected]> wrote:
> Oops. I actually don't know that. SSVD is SVD (as much as Lanczos is).
>
> I guess you need help from the rest of the collective.
>
> On Thu, Jul 26, 2012 at 1:21 PM, Aniruddha Basak <[email protected]> wrote:
>> Actually that's my confusion. I don't need the eigenvectors of AA'
>> but of A!
>> If I can find a matrix B such that BB'=A, then from the SVD decomposition
>> of B we can get the eigenvectors of A. But how to get B in that case?
>>
>> Aniruddha
>>
>> -----Original Message-----
>> From: Dmitriy Lyubimov [mailto:[email protected]]
>> Sent: Thursday, July 26, 2012 1:18 PM
>> To: [email protected]
>> Subject: Re: eigendecomposition of very large matrices
>>
>> See http://en.wikipedia.org/wiki/Singular_value_decomposition,
>> "relation to eigenvalue decomposition".
>>
>> Depending on what you consider the source for the eigendecomposition, AA'
>> or A'A, the eigenvectors would be column vectors of U or V respectively.
>>
>> On Thu, Jul 26, 2012 at 1:12 PM, Aniruddha Basak <[email protected]> wrote:
>>> Hi,
>>> I am trying to use SSVD instead of Lanczos, as a part of Spectral Kmeans.
>>> However, I could not find the relation between the eigenvectors and the
>>> U, V matrices.
>>> Can someone please tell me how to retrieve the eigenvectors from the SSVD
>>> decomposition?
>>>
>>> Thanks,
>>> Aniruddha
>>>
>>> -----Original Message-----
>>> From: Dmitriy Lyubimov [mailto:[email protected]]
>>> Sent: Thursday, July 19, 2012 10:53 PM
>>> To: [email protected]
>>> Subject: RE: eigendecomposition of very large matrices
>>>
>>> PPS: if you do insist on having a lot of k then you'll benefit from a
>>> smaller hdfs block size, not larger.
>>> On Jul 19, 2012 10:50 PM, "Dmitriy Lyubimov" <[email protected]> wrote:
>>>
>>>> Yeah, I see, OK. Both experiments conducted with mahout ssvd that I am
>>>> familiar with dealt with input size greater than yours element-wise,
>>>> on a quite modest node count. So I don't think your input size will
>>>> be a problem. But the number of singular values will be.
>>>>
>>>> But I doubt any input will yield anything useful beyond k=200 but
>>>> statistical noise, even if you have a good decay of the singular values.
>>>> But I bet you don't need that many. You can fit significantly more
>>>> 'clusters' on a 'fairly small' dimensional space.
>>>> On Jul 19, 2012 6:33 PM, "Aniruddha Basak" <[email protected]> wrote:
>>>>
>>>>> Thanks Dmitriy for your reply.
>>>>> The matrix I am working on has 10-20 non-zero entries per row, so
>>>>> it's very sparse.
>>>>> I am trying to do spectral clustering, which involves eigen-decomposition.
>>>>> I am wondering whether anyone has tried to do spectral clustering
>>>>> using mahout for a very large affinity matrix (input).
>>>>>
>>>>> Aniruddha
>>>>>
>>>>> -----Original Message-----
>>>>> From: Dmitriy Lyubimov [mailto:[email protected]]
>>>>> Sent: Thursday, July 19, 2012 6:28 PM
>>>>> To: [email protected]
>>>>> Subject: Re: eigendecomposition of very large matrices
>>>>>
>>>>> Very significant sparsity may be a problem though for -q >=1 parameters.
>>>>> Again, it depends on the hardware you have and the # of non-zero
>>>>> elements in the input, but -q=1 is still the most recommended setting
>>>>> here.
>>>>>
>>>>> On Thu, Jul 19, 2012 at 6:20 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>> > you may try SSVD.
>>>>> > https://cwiki.apache.org/confluence/display/MAHOUT/Stochastic+Singular+Value+Decomposition
>>>>> >
>>>>> > but 4k eigenvectors (or, rather, singular values) is still kind of
>>>>> > a lot and may push the precision out of the error estimates.
>>>>> > I don't think we had a precision study for that many. You also need
>>>>> > quite a bit of memory to compute that (not to mention flops). More
>>>>> > realistically you could probably try 1k singular values. You may
>>>>> > try more if you have access to more powerful hardware than we did
>>>>> > in the studies, but distributed computation time will grow at about
>>>>> > k^1.5, i.e. faster than linear, even if you have enough nodes for the
>>>>> > tasks.
>>>>> >
>>>>> > -d
>>>>> >
>>>>> > On Thu, Jul 19, 2012 at 6:12 PM, Aniruddha Basak <[email protected]> wrote:
>>>>> >> Hi,
>>>>> >> I am working on a clustering problem which involves determining
>>>>> >> the largest "k" eigenvectors of a very large matrix. The
>>>>> >> matrices I work on are typically of the order of 10^6 by 10^6.
>>>>> >> Trying to do this using the Lanczos solver available in Mahout, I
>>>>> >> found it is very slow and takes around 1.5 minutes to compute
>>>>> >> each eigenvector.
>>>>> >> Hence to get 4000 eigenvectors, it takes 100 hours or 4 days !!
>>>>> >>
>>>>> >> So I am looking for something faster to solve the "Eigen decomposition"
>>>>> >> problem for a very large sparse matrix. Please suggest what I
>>>>> >> should use.
>>>>> >>
>>>>> >> Thanks,
>>>>> >> Aniruddha
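On the question raised in this thread of how eigenvectors of A relate to U and V: if A is symmetric (as an affinity matrix in spectral clustering typically is), no matrix B with BB'=A should be needed. Writing A = U S V', the singular values are the absolute values of the eigenvalues, and, provided those absolute values are all distinct, each column of U is itself an eigenvector of A, with V differing from U only by the sign of the eigenvalue. The sketch below checks this on a small dense matrix with NumPy. It is only an illustration under those assumptions: it uses an exact dense SVD rather than Mahout's SSVD (which is approximate), and the 6x6 matrix and its eigenvalues are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the real input: a small symmetric matrix built
# from chosen eigenvalues whose absolute values are all distinct.
true_eigs = np.array([5.0, -3.0, 2.0, 1.0, -0.5, 0.25])
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))  # random orthonormal basis
A = Q @ np.diag(true_eigs) @ Q.T
A = (A + A.T) / 2.0  # remove rounding asymmetry

# SVD of A itself (not of AA' or A'A): A = U diag(s) V'
U, s, Vt = np.linalg.svd(A)

# Recover the signed eigenvalue for each column u of U as u' A u.
lam = np.sum(U * (A @ U), axis=0)

# Each column of U satisfies A u = lam * u, i.e. it is an eigenvector of A.
assert np.allclose(A @ U, U * lam)

# Singular values are the absolute values of the eigenvalues.
assert np.allclose(s, np.abs(lam))

# Columns of V equal columns of U up to the sign of the eigenvalue.
assert np.allclose(Vt.T, U * np.sign(lam))
```

If two eigenvalues share the same absolute value (for example 2 and -2), the corresponding columns of U can be mixtures of their eigenvectors, so the check `A @ U == U * lam` may fail for those columns. With an approximate method such as SSVD the equalities would also only hold up to the approximation error.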
