On 08/06/10 23:47, Jake Mannix wrote:
On Tue, Jun 8, 2010 at 3:20 PM, Sean Owen<[email protected]>  wrote:

Part 2. Compute the SVD
3. Run Lanczos, I'm guessing, on user vectors.

Sounds right at this point.  One important point on this:
DistributedLanczosSolver produces left singular vectors, and the
singular values, but they can be "dirty" - have some duplicates,
have some which are not converged quite enough, not orthogonal
enough, etc.  Thus you should run "EigenVerificationJob" on the
output of that job, and the output of *this* will be "clean" (based
on parameters you set on the job - convergence criteria,
orthogonality, minimum singular value allowed, etc).
EigenVerificationJob will output V, and S. If you want U, then you
can get that by computing userVectors.times(V).times(S^-1), essentially
(since A = U S V^T implies U = A V S^-1).
This can be done in one map-reduce pass (or two if the transposes
don't line up the right way), by modelling after MatrixMultiplyJob.
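
As a rough illustration of the algebra behind recovering U: if the user matrix factors as A = U S V^T with orthonormal columns in V, then A V = U S, so each column of A V divided by its singular value gives the matching column of U. The sketch below does this in memory with plain arrays. It does not use the Mahout API; the class name, the method recoverU, and the tiny example matrix are all hypothetical, and a real job would distribute the same per-row computation the way MatrixMultiplyJob does.

```java
// Minimal in-memory sketch (not Mahout code): recover U from A, V and S,
// assuming A = U * S * V^T and that the columns of V are orthonormal.
class RecoverLeftSingularVectors {

    // a: m x n matrix (one row per user), v: n x k matrix whose columns are
    // singular vectors, s: the k singular values. Returns the m x k matrix
    // U = A * V * S^{-1}.
    static double[][] recoverU(double[][] a, double[][] v, double[] s) {
        int m = a.length;
        int n = v.length;
        int k = s.length;
        double[][] u = new double[m][k];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < k; j++) {
                double dot = 0.0;
                for (int t = 0; t < n; t++) {
                    dot += a[i][t] * v[t][j]; // entry (i, j) of A * V
                }
                u[i][j] = dot / s[j];         // scale column j by 1 / s_j
            }
        }
        return u;
    }

    public static void main(String[] args) {
        // Hypothetical example: A^T A = diag(9, 4), so V = I and S = (3, 2).
        double[][] a = {{3, 0}, {0, 2}, {0, 0}};
        double[][] v = {{1, 0}, {0, 1}};
        double[] s = {3, 2};
        for (double[] row : recoverU(a, v, s)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

Each row of U depends only on the matching row of A plus the (small) V and S, which is why this can plausibly fit in a single map-reduce pass when V is small enough to ship to every mapper.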



How does the EigenVerificationJob represent V and S in the SequenceFile<IntWritable, VectorWritable> output? I guess the same question applies to the DistributedLanczosSolver.
