Hi Eshwaran! Can you please try with rank=4 and let me know what you get? If I recall correctly, the requested rank should be 4, and then you get 3 eigenvalues. Take a look at: http://bickson.blogspot.com/2011/02/some-thoughts-about-accuracy-of-mahouts.html
Regarding sign changes, I remember seeing this as well... The best way to debug is to download the Lanczos code I wrote in Matlab: http://www.cs.cmu.edu/~bickson/gabp/#download and then run it iteration by iteration in the debugger. Instructions for setting up the debugging environment in Eclipse are found here: http://bickson.blogspot.com/2011/02/hadoopmahout-setting-up-development.html

Best,
DB

On Tue, Jun 14, 2011 at 5:02 PM, Eshwaran Vijaya Kumar <evijayaku...@mozilla.com> wrote:

> Hello all,
> I am trying to compare the Mahout (0.5 RELEASE) Lanczos Solver results
> with Matlab and am having issues satisfying myself regarding the
> correctness of Mahout's output. I would appreciate some clarification from
> someone who has looked at the code for a longer period of time than I have.
> I used code similar to what Danny has done here (
> https://issues.apache.org/jira/browse/MAHOUT-369 )
>
> I added to TestLanczosSolver the following code:
>
>   @Test
>   public void testLanczosSolver2() throws Exception {
>     int numRows = 3; int numCols = 3;
>     SparseRowMatrix m = new SparseRowMatrix(new int[]{numRows, numCols});
>     /**
>      *
>      *  3.1200 -3.1212 -3.0000
>      * -3.1110  1.5000  2.1212
>      * -7.0000 -8.0000 -4.0000
>      *
>      * */
>     m.set(0,0,3.12);
>     m.set(0,1,-3.12121);
>     m.set(0,2,-3);
>     m.set(1,0,-3.111);
>     m.set(1,1,1.5);
>     m.set(1,2,2.12122);
>     m.set(2,0,-7);
>     m.set(2,1,-8);
>     m.set(2,2,-4);
>
>     int rank = 3;
>     System.out.println("******** Starting Eshwaran's Tests *************");
>     Vector initialVector = new DenseVector(numCols);
>     initialVector.assign(1d / Math.sqrt(numCols));
>     LanczosState state = new LanczosState(m, numCols, rank, initialVector);
>     long time = timeLanczos(m, state, rank, false);
>     assertTrue("Lanczos taking too long! Are you in the debugger? ", time < 10000);
>     //assertOrthonormal(eigens);
>     ////assertEigen(eigens, m, 0.1, false);
>   }
>
> Note that I had to slightly modify Danny's test code to get it working with
> the (latest ?) Mahout API.
>
> I printed out the value of realEigen in LanczosSolver.java. I also
> commented out the normalization step ( //nextVector.assign(new Scale(1.0 /
> state.getScaleFactor()));
> ) as was recommended in that discussion.
>
> My output:
>
> ******** Starting Eshwaran's Tests -I *************
> Jun 14, 2011 1:54:12 PM org.slf4j.impl.JCLLoggerAdapter info
> INFO: Finding 3 singular vectors of matrix with 3 rows, via Lanczos
> Jun 14, 2011 1:54:12 PM org.slf4j.impl.JCLLoggerAdapter info
> INFO: 1 passes through the corpus so far...
> Jun 14, 2011 1:54:12 PM org.slf4j.impl.JCLLoggerAdapter info
> INFO: 2 passes through the corpus so far...
> Jun 14, 2011 1:54:12 PM org.slf4j.impl.JCLLoggerAdapter info
> INFO: Lanczos iteration complete - now to diagonalize the tri-diagonal
> auxiliary matrix.
> Jun 14, 2011 1:54:12 PM org.slf4j.impl.JCLLoggerAdapter info
> INFO: Eigenvector [0.5536042338073482, 0.7356862573677923,
> 0.39024105759229055] found with eigenvalue 0.0
> Jun 14, 2011 1:54:12 PM org.slf4j.impl.JCLLoggerAdapter info
> INFO: Eigenvector [0.16585402589950515, -0.5566129719645326,
> 0.8140481813343339] found with eigenvalue 4.755295040050496
> Jun 14, 2011 1:54:12 PM org.slf4j.impl.JCLLoggerAdapter info
> INFO: Eigenvector [0.8160972946919413, -0.38593746923689726,
> -0.43015982545504394] found with eigenvalue 129.2456107625402
> Jun 14, 2011 1:54:12 PM org.slf4j.impl.JCLLoggerAdapter info
> INFO: LanczosSolver finished.
>
> Comparing with Matlab.
>
> A =
>
>    3.1200   -3.1212   -3.0000
>   -3.1110    1.5000    2.1212
>   -7.0000   -8.0000   -4.0000
>
> [a,b] = eig(A'*A)
>
> a =
>
>    0.2132   -0.8010   -0.5593
>   -0.5785    0.3578   -0.7330
>    0.7873    0.4799   -0.3871
>
> b =
>
>     0.0314         0         0
>          0   42.6175         0
>          0         0  131.2552
>
> Note that only one of the eigenvalues matches. Uncommenting the
> normalization step obviously ensured that nothing matched. Furthermore,
> there are sign changes in the eigenvectors and they don't appear to be
> correctly matched up. For example, the eigenvector corresponding to value
> (131) in Mahout's case is [0.8160972946919413, -0.38593746923689726,
> -0.43015982545504394] which as you can see from the Matlab output is the
> eigenvector associated with 42.61.
>
> Can someone clarify what I am missing here?
>
> Thanks in advance
> Eshwaran
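
One way to check the pairs reported above without relying on either Mahout or Matlab is to form A'*A directly and look at the residual ||(A'*A)v - lambda*v|| for each reported (lambda, v). The residual does not change when v is replaced by -v, so sign flips drop out of the comparison, and the Rayleigh quotient v'(A'*A)v / v'v shows which eigenvalue a given vector is actually closest to. What follows is only a minimal sketch in plain Java with no Mahout dependency; the class and method names (EigenCheck, gram, residual, rayleigh) are made up for illustration and are not part of any library.

```java
public class EigenCheck {

    // Returns B = A^T * A for a dense, row-major matrix A.
    static double[][] gram(double[][] a) {
        int rows = a.length;
        int cols = a[0].length;
        double[][] b = new double[cols][cols];
        for (int i = 0; i < cols; i++) {
            for (int j = 0; j < cols; j++) {
                double sum = 0.0;
                for (int k = 0; k < rows; k++) {
                    sum += a[k][i] * a[k][j];
                }
                b[i][j] = sum;
            }
        }
        return b;
    }

    // Dense matrix-vector product B * v.
    static double[] multiply(double[][] b, double[] v) {
        double[] out = new double[b.length];
        for (int i = 0; i < b.length; i++) {
            for (int j = 0; j < v.length; j++) {
                out[i] += b[i][j] * v[j];
            }
        }
        return out;
    }

    // Euclidean norm of v.
    static double norm(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }

    // Relative residual ||B v - lambda v|| / ||v||; identical for v and -v.
    static double residual(double[][] b, double lambda, double[] v) {
        double[] bv = multiply(b, v);
        double[] diff = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            diff[i] = bv[i] - lambda * v[i];
        }
        return norm(diff) / norm(v);
    }

    // Rayleigh quotient v^T B v / v^T v: the eigenvalue estimate implied by v.
    static double rayleigh(double[][] b, double[] v) {
        double[] bv = multiply(b, v);
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < v.length; i++) {
            num += v[i] * bv[i];
            den += v[i] * v[i];
        }
        return num / den;
    }

    public static void main(String[] args) {
        // Matrix entries copied from the test code quoted above.
        double[][] a = {
            {3.12, -3.12121, -3.0},
            {-3.111, 1.5, 2.12122},
            {-7.0, -8.0, -4.0}
        };
        double[][] b = gram(a);

        // (eigenvalue, eigenvector) pairs copied from the Mahout log above.
        double[] lambdas = {0.0, 4.755295040050496, 129.2456107625402};
        double[][] vectors = {
            {0.5536042338073482, 0.7356862573677923, 0.39024105759229055},
            {0.16585402589950515, -0.5566129719645326, 0.8140481813343339},
            {0.8160972946919413, -0.38593746923689726, -0.43015982545504394}
        };

        for (int i = 0; i < lambdas.length; i++) {
            System.out.printf("lambda=%.6f residual=%.6f rayleigh=%.6f%n",
                lambdas[i], residual(b, lambdas[i], vectors[i]), rayleigh(b, vectors[i]));
        }
    }
}
```

A pair with a residual near zero is a genuine eigenpair of A'*A; a large residual together with a Rayleigh quotient close to a different eigenvalue suggests the vector has been paired with the wrong value.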