The 0 eigenvalue output is not valid, and yes, the output will list the results in *increasing* order, even though it is finding the largest eigenvalues/vectors first.
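
If you want to post-process the output by hand, here is a rough sketch of the idea in plain Python. It is not Mahout code. The function names, the toy matrix and the fake "solver output" are all made up for illustration. It flips the pairs into descending order and computes a residual ||Av - lambda*v|| / ||v|| for each one, which should be close to 0 for a genuine eigenpair.

```python
import math


def mat_vec(a, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def relative_residual(a, eigenvalue, v):
    """Return ||A v - lambda v|| / ||v||; near 0 for a true eigenpair."""
    av = mat_vec(a, v)
    diff = math.sqrt(sum((x - eigenvalue * y) ** 2 for x, y in zip(av, v)))
    norm = math.sqrt(sum(y * y for y in v))
    return diff / norm


def reorder_and_check(a, pairs):
    """Sort (eigenvalue, eigenvector) pairs largest-first and attach each residual."""
    checked = [(lam, vec, relative_residual(a, lam, vec)) for lam, vec in pairs]
    return sorted(checked, key=lambda t: t[0], reverse=True)


# Toy symmetric matrix with known eigenvalues 2 and 5 (made up for illustration).
A = [[2.0, 0.0], [0.0, 5.0]]
# Hypothetical pairs listed in increasing order, with a bogus 0.0 entry first.
solver_output = [
    (0.0, [1.0, 1.0]),
    (2.0, [1.0, 0.0]),
    (5.0, [0.0, 1.0]),
]

results = reorder_and_check(A, solver_output)
for lam, _, err in results:
    print(f"eigenvalue {lam:.4f}  residual {err:.4f}")
```

A pair whose residual is far from zero, like the 0.0 entry in this toy input, is one you would throw away.
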
Remember that convergence is gradual, so if you only ask for 3 eigenvectors/values, they won't be very accurate. If you ask for 10 or more, the largest few will be quite good. If you ask for 50, the top 10-20 will be *extremely* accurate, and maybe the top 30 will still be quite good.

Try out a non-distributed form of what is in the EigenVerificationJob to re-order the output and check how accurate your results are (it computes errors for you as well).

  -jake

2011/6/23 <[email protected]>

> So, I know that MAHOUT-369 fixed a bug with the distributed version of the
> LanczosSolver but I am experiencing a similar problem with the
> non-distributed version.
>
> I send a dataset of gaussian distributed numbers (testing PCA stuff) and
> my eigenvalues are seemingly reversed. Below I have the output given in
> the logs from LanczosSolver.
>
> Output:
> INFO: Eigenvector 0 found with eigenvalue 0.0
> INFO: Eigenvector 1 found with eigenvalue 347.8703086831804
> INFO: LanczosSolver finished.
>
> So it returns a vector with eigenvalue 0 before one with an eigenvalue of
> 347? What's more interesting is that when I increase the rank, I get a new
> eigenvector with a value between 0 and 347:
>
> INFO: Eigenvector 0 found with eigenvalue 0.0
> INFO: Eigenvector 1 found with eigenvalue 44.794928654801566
> INFO: Eigenvector 2 found with eigenvalue 347.8286920203704
>
> Shouldn't the eigenvalues be in descending order? Also is the 0.0
> eigenvalue even valid?
>
> Thanks,
> Trevor
