The 0 eigenvalue in the output is not valid. And yes, the output lists the
results in *increasing* order, even though the solver finds the largest
eigenvalues/vectors first.
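
To make the ordering concrete, here is a minimal plain-Java sketch (not
Mahout code) that takes eigenvalues in the order they were logged, skips the
invalid 0.0 entry, and returns the rest largest-first. The class name
EigenOrdering, the method descendingValid, and the tolerance are hypothetical
names chosen for illustration. It also assumes that any value this close to
zero is the spurious one, which would be wrong for a matrix that really has a
zero eigenvalue.

```java
import java.util.Arrays;

class EigenOrdering {
    // Returns the eigenvalues largest-first, skipping entries whose magnitude
    // is at or below tol (assumed here to be the spurious 0.0 slot).
    static double[] descendingValid(double[] logged, double tol) {
        double[] kept = Arrays.stream(logged)
                .filter(v -> Math.abs(v) > tol)
                .sorted()               // ascending
                .toArray();
        // Reverse in place to get descending order.
        for (int i = 0, j = kept.length - 1; i < j; i++, j--) {
            double tmp = kept[i];
            kept[i] = kept[j];
            kept[j] = tmp;
        }
        return kept;
    }

    public static void main(String[] args) {
        // Values copied from the rank-3 log output quoted below.
        double[] logged = {0.0, 44.794928654801566, 347.8286920203704};
        System.out.println(Arrays.toString(descendingValid(logged, 1e-9)));
    }
}
```

The same index permutation would have to be applied to the eigenvectors so
that each vector stays paired with its eigenvalue.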

Remember that convergence is gradual, so if you only ask for 3
eigenvectors/values, the results won't be very accurate.  If you ask for 10
or more, the largest few will be quite good.  If you ask for 50, the top
10-20 will be *extremely* accurate, and maybe the top 30 will still be quite
good.

Try out a non-distributed form of what is in the EigenVerificationJob to
re-order the output and measure how accurate your results are (it computes
errors for you as well).
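
As a rough illustration of what such a check involves, here is a plain-Java
sketch; it is not the Mahout implementation. It assumes a small dense
symmetric matrix held in a double[][], and it scores a candidate eigenvector
v by 1 - |cos(angle between v and Av)|, which is 0 for an exact eigenvector.
The class EigenCheck and its methods are hypothetical, and the error measure
is an assumption about a reasonable metric; check the EigenVerificationJob
source for what it actually computes.

```java
class EigenCheck {
    // Dense matrix-vector product a * v.
    static double[] times(double[][] a, double[] v) {
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < v.length; j++) {
                out[i] += a[i][j] * v[j];
            }
        }
        return out;
    }

    static double dot(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    // Rayleigh quotient (v . Av) / (v . v): the eigenvalue estimate for v.
    static double rayleigh(double[][] a, double[] v) {
        return dot(v, times(a, v)) / dot(v, v);
    }

    // 1 - |cos(angle between v and Av)|. Evaluates to NaN when v or Av is
    // the zero vector, since the angle is undefined in that case.
    static double error(double[][] a, double[] v) {
        double[] av = times(a, v);
        double cos = dot(v, av) / Math.sqrt(dot(v, v) * dot(av, av));
        return 1.0 - Math.abs(cos);
    }

    public static void main(String[] args) {
        double[][] a = {{2, 1}, {1, 2}};   // symmetric, eigenvalues 3 and 1
        double[] exact = {1, 1};           // true eigenvector for eigenvalue 3
        double[] rough = {1, 0};           // not an eigenvector
        System.out.println("exact: value=" + rayleigh(a, exact)
                + " error=" + error(a, exact));
        System.out.println("rough: value=" + rayleigh(a, rough)
                + " error=" + error(a, rough));
    }
}
```

Sorting the candidates by their estimated eigenvalue and reporting this error
next to each one gives a quick view of how many of the top vectors have
converged for a given requested rank.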

  -jake

2011/6/23 <[email protected]>

> So, I know that MAHOUT-369 fixed a bug with the distributed version of the
> LanczosSolver but I am experiencing a similar problem with the
> non-distributed version.
>
> I sent a dataset of Gaussian-distributed numbers (testing PCA stuff) and
> my eigenvalues are seemingly reversed. Below I have the output given in
> the logs from LanczosSolver.
>
> Output:
> INFO: Eigenvector 0 found with eigenvalue 0.0
> INFO: Eigenvector 1 found with eigenvalue 347.8703086831804
> INFO: LanczosSolver finished.
>
> So it returns a vector with eigenvalue 0 before one with an eigenvalue of
> 347? What's more interesting is that when I increase the rank, I get a new
> eigenvector with a value between 0 and 347:
>
> INFO: Eigenvector 0 found with eigenvalue 0.0
> INFO: Eigenvector 1 found with eigenvalue 44.794928654801566
> INFO: Eigenvector 2 found with eigenvalue 347.8286920203704
>
> Shouldn't the eigenvalues be in descending order? Also is the 0.0
> eigenvalue even valid?
>
> Thanks,
> Trevor
>
>
