OK, so I have jobs #1 and #2 mostly done. The results are on GitHub at:

https://github.com/jtraupman/mahout

under the mahout-672 branch.

I'm running into some trouble with a couple of the Lanczos test cases:

TestLanczosSolver.testEigenvalueCheck() wouldn't pass until I reduced the
fraction of expected good eigenvectors from 0.75 to 0.625. I believe this is
because the original version of the test was running the Lanczos solver in
asymmetric mode even though the input matrix is symmetric and the colt
solver runs in symmetric mode.

TestDistributedLanczosSolverCLI.testDistributedLanczosSolverEVJCLI() won't
pass and I'm not sure why. The eigenvalues/vectors from the rank 20 run
don't seem to be the same as the ones we get back from the rank 10 run. Not
sure what's going on here -- I'll continue to look into it more tomorrow. It
might have something to do with the fact that SimpleEigenVerifier uses
times() instead of timesSquared() now -- I'm relying on the caller to
construct a SquaredLinearOperator from the input for the asymmetric case.
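
To make the times() vs. timesSquared() point concrete, here is a minimal, self-contained sketch of the idea. It is not the code in the branch: the LinearOperator interface, DenseOperator, and SquaredOperator below are simplified stand-ins written for illustration, and power iteration is swapped in for Lanczos to keep it short. It only shows that wrapping an asymmetric A so that times(x) returns A'(Ax) yields a symmetric operator whose eigenvalues are the squared singular values of A, which is why the sqrt rescaling belongs to the asymmetric case.

```java
import java.util.Arrays;

public class SquaredOperatorSketch {

    /** Hypothetical stand-in for the LinearOperator idea: anything that can compute A * x. */
    interface LinearOperator {
        double[] times(double[] x);
        int numCols();
    }

    /** Dense row-major matrix exposed as an operator, plus a transpose product. */
    static final class DenseOperator implements LinearOperator {
        private final double[][] rows;

        DenseOperator(double[][] rows) { this.rows = rows; }

        public int numCols() { return rows[0].length; }

        public double[] times(double[] x) {
            double[] y = new double[rows.length];
            for (int i = 0; i < rows.length; i++) {
                for (int j = 0; j < x.length; j++) {
                    y[i] += rows[i][j] * x[j];
                }
            }
            return y;
        }

        double[] transposeTimes(double[] y) {
            double[] x = new double[numCols()];
            for (int i = 0; i < rows.length; i++) {
                for (int j = 0; j < x.length; j++) {
                    x[j] += rows[i][j] * y[i];
                }
            }
            return x;
        }
    }

    /** Wraps A so that times(x) returns A'(A x); the wrapped operator is symmetric. */
    static final class SquaredOperator implements LinearOperator {
        private final DenseOperator a;

        SquaredOperator(DenseOperator a) { this.a = a; }

        public int numCols() { return a.numCols(); }

        public double[] times(double[] x) { return a.transposeTimes(a.times(x)); }
    }

    /** Power iteration (not Lanczos): dominant eigenvalue of a symmetric PSD operator. */
    static double dominantEigenvalue(LinearOperator op, int iterations) {
        int n = op.numCols();
        double[] v = new double[n];
        Arrays.fill(v, 1.0 / Math.sqrt(n)); // unit-length starting vector
        double lambda = 0.0;
        for (int k = 0; k < iterations; k++) {
            double[] w = op.times(v);
            double norm = 0.0;
            for (double d : w) {
                norm += d * d;
            }
            norm = Math.sqrt(norm);
            if (norm == 0.0) {
                return 0.0; // v is in the null space; nothing more to learn
            }
            lambda = norm; // ||Bv|| with ||v|| = 1 approaches the top eigenvalue
            for (int j = 0; j < n; j++) {
                v[j] = w[j] / norm;
            }
        }
        return lambda;
    }

    public static void main(String[] args) {
        // Asymmetric (3x2) input whose singular values are 3 and 2 by construction.
        DenseOperator a = new DenseOperator(new double[][] {{3, 0}, {0, 2}, {0, 0}});
        double eigenvalue = dominantEigenvalue(new SquaredOperator(a), 200);
        // Asymmetric case: the singular value is the sqrt of the eigenvalue of A'A.
        System.out.println("largest singular value ~ " + Math.sqrt(eigenvalue));
    }
}
```

With this arrangement the solver and the verifier only ever call times(), and the decision to square the input lives with the caller. For an input that is already symmetric, the operator would be passed in unwrapped and no sqrt applied.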

I'll keep looking into these issues if one of you guys doesn't get to it
first. After this is done, it's just a matter of integrating the LSMR and CG
code into this patch, which should be straightforward.

-Jon


On Mon, May 2, 2011 at 10:37 AM, Jake Mannix <[email protected]> wrote:

> On Mon, May 2, 2011 at 12:13 AM, Jonathan Traupman <[email protected]> wrote:
>
> > It's coming along. I have the LinearOperator stuff done, and refactored the
> > Matrix class/interface hierarchy to inherit/implement from LinearOperator. I
> > also have refactored Lanczos solver to take a LinearOperator instead of a
> > VectorIterable. I made a small change to the Lanczos solver API for this:
> > the isSymmetric flag is now on the Lanczos state instead of the solver, and
> > the state does the sqrt operation to rescale the singular values.
> >
> > Still to do:
> >
> > 1. get mahout-math back into a stable state (compile cleanly and make
> > whatever test changes are necessary for this)
> > 2. refactor DistributedRowMatrix and DistributedLanczos solver to use
> > LinearOperators
> > 3. refactor my CG solver and Ted's LSMR solver to use LinearOperators
> >
> > I didn't get much opportunity to work on it this weekend because my wife was
> > monopolizing the computer for a paper she has due tomorrow. I'm hoping to
> > get at least #1 done tomorrow, at which point I'll push it out to github for
> > you guys to check out.
> >
>
> Getting #1 in and pushed to GitHub would be awesome - don't try to bite off
> too much, the codebase moves fast, so you'll avoid getting out of sync if
> we can get this stuff reviewed earlier rather than later, especially for
> fairly invasive operations like this.
>
> I'm going to be committing a bunch of the current stuff this week, so
> if your stuff is on GitHub, I can pull in parts of it, if appropriate (i.e.
> hopefully both 1+2).  We may have to resolve some conflicts as we go, as I'm
> working in the LanczosSolver and DistributedLanczosSolver code as well, but
> I'll make sure those changes get pushed to my GitHub branches so we can
> easily resolve them.
>
>  -jake
>
>
> >
> > -Jon
> >
> >
> > On Sun, May 1, 2011 at 9:01 PM, Jake Mannix <[email protected]> wrote:
> >
> > > Hey Jonathan,
> > >
> > >  How's the progress on MAHOUT-672?  Do you have your LinearOperator stuff
> > > on Github or anywhere else?  If not, I can pick up and run with it.
> > >
> > >  -jake
> > >
> > > On Thu, Apr 28, 2011 at 12:32 AM, Jonathan Traupman <[email protected]> wrote:
> > >
> > > > I'm working on the LinearOperator stuff for MAHOUT-672 and have gotten to
> > > > the point where I'm modifying the new LanczosSolver as implemented in the
> > > > MAHOUT-319 patch applied to Jake's github repo.
> > > >
> > > > One quick question: on lines 165-167 of LanczosSolver.java we have the
> > > > lines:
> > > >
> > > > if (isSymmetric) {
> > > >  e = Math.sqrt(e);
> > > > }
> > > >
> > > > Unless I'm misunderstanding something, isn't this backwards? My
> > > > understanding is that Lanczos is an eigendecomposition algorithm, so for a
> > > > symmetric A, it's going to compute eigenvector matrix U and diagonal
> > > > eigenvalue matrix S such that A ~= USU'. To use it to compute the SVD, you
> > > > use the fact that for non-symmetric A = USV', we have A'A = V(S^2)V', so by
> > > > taking the eigendecomposition of A'A, you get the right singular vectors as
> > > > the eigenvectors and the singular values as the sqrt of the eigenvalues. So,
> > > > shouldn't these lines be:
> > > >
> > > > if (!isSymmetric) {
> > > >  e = Math.sqrt(e);
> > > > }
> > > >
> > > > -Jon
> > > >
> > >
> >
>
