Hi Jake, Thanks a lot for your reply.
I just have another question, the LanzcosSolver documentation says: To avoid floating point overflow problems which arise in power-methods like Lanczos, an initial pass is made through the input matrix to generate a good starting seed vector by summing all the rows of the input matrix, andcompute the trace(inputMatrixt*matrix) This latter value, being the sum of all of the singular values, is used to rescale the entire matrix, effectively forcing the largest singular value to be strictly less than one, and transforming floating point overflow problems into floating point underflow (ie, very small singular values will become invisible, as they will appear to be zero and the algorithm will terminate). And because of that all the eigenValues I get are less than 0. Is it possible to get te real values? K.R. Pedro. ---------------------------------------- > From: [email protected] > Date: Mon, 22 Nov 2010 14:01:31 -0800 > Subject: Re: Lanczos Algorithm > To: [email protected] > > Hi Pedro, > > I wrote the Lanczos stuff in Mahout, but I've been swamped with my new job > the past few months, and haven't been able to interact on the list much, but > I've got a minute or two to try and answer your questions. > > First: the LanczosSolver does indeed leave you with eigenvalues in the > opposite order than you expect (ascending order, instead of descending). It > should be noted that in general, the Lanczos method *does not* give you the > "largest N eigenvalues" - if you ask for a rank-N decomposition via Lanczos, > you'll get the smallest eigenvalue for sure, then somewhat less than N/2 of > the other small eigenvalues (with the last few being right in the middle of > the spectrum), then somewhat less than N/2 of the largest eigenvalues (with > the first few being in the middle of the spectrum), ending with the largest > eigenvalue. > > If you really want the top N eigenvalues, then you should ask for somewhere > between 3N and 4N rank decomposition, and take the last N of these. You > won't be guaranteed to get *exactly all* of the top N eigenvalues, > especially if you have a very rapidly decreasing eigenvalue spectrum (as is > usually the case with real data), and you reach the "plateau" of middling > values before N. In this case, if there are K > get all K of these, and then N-K's worth of a sampling of the middling > eigenvalues. > > One other thing you need to remember about the LanczosSolver: if your input > matrix is symmetric, pass in the boolean isSymmetric=true to the solve() > method, or else you'll get wrong values. > > -jake > > On Mon, Nov 22, 2010 at 1:07 PM, PEDRO MANUEL JIMENEZ RODRIGUEZ < > [email protected]> wrote: > > > > > Hi Ted, > > > > I can't give you an exact amount but more or less it could be around 10^5 > > non-zero elements per row. > > > > Could you please let me know, why the lanzcos algorithm is not always > > returning the values in a decreasing order? > > > > Thanks. > > > > Pedro. > > > > ---------------------------------------- > > > From: [email protected] > > > Date: Fri, 19 Nov 2010 13:34:19 -0800 > > > Subject: Re: Lanczos Algorithm > > > To: [email protected] > > > > > > How many non-zero elements? > > > > > > On Fri, Nov 19, 2010 at 12:34 PM, PEDRO MANUEL JIMENEZ RODRIGUEZ < > > > [email protected]> wrote: > > > > > > > > > > > > > > > I was talking about 10^9 rows and 10^9 columns > > > > > > > > ---------------------------------------- > > > > > From: [email protected] > > > > > Date: Fri, 19 Nov 2010 12:07:16 -0800 > > > > > Subject: Re: Lanczos Algorithm > > > > > To: [email protected] > > > > > > > > > > On Fri, Nov 19, 2010 at 11:17 AM, PEDRO MANUEL JIMENEZ RODRIGUEZ < > > > > > [email protected]> wrote: > > > > > > > > > > > In this project I would have to work with matrix of 10^9, which > > have a > > > > very > > > > > > sparse data. > > > > > > > > > > > > > > > I think you mean 10^9 rows and 10^9 columns with much fewer 10^18 > > > > non-zero > > > > > elements. > > > > > > > > > > Is that correct? > > > > > > > > > > Can you say how many non-zero elements? > > > > > > > > > > > >
