I've been trying to figure out how to code an SVD algorithm, and I've seen
some questions about SVD algorithms floating around the Mahout mailing
lists.  I thought it might be helpful to share what I've found so far.

 

Really large matrices require one of the randomized methods to be tractable
at all.  The methods I've seen reduce one dimension of the problem, but if
singular values and/or left/right singular vectors are required, then an SVD
still remains to be done.  Here is what I've found as approaches to getting
that SVD done.

 

Lanczos Algorithm - The Lanczos algorithm is covered pretty well in Chapter
9 of Golub and Van Loan, "Matrix Computations".  The Lanczos algorithm
constructs an orthonormal basis for the Krylov space (i.e. the space
spanned by x, Ax, A^2 x, ...).  The basic Lanczos algorithm is only suited to
symmetric matrices and has some numerical difficulties maintaining
orthonormality.  The numerical issues are addressed in "Matrix
Computations".  That book also describes the Arnoldi algorithm, which is a
derivative of the Lanczos algorithm.  The Arnoldi algorithm works for square
non-symmetric matrices, but it also has some numerical issues, which are
touched upon in "Matrix Computations".
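
To make that concrete, here is a minimal sketch of the symmetric Lanczos
iteration in Python/NumPy.  It uses full reorthogonalization as a blunt fix
for the orthogonality loss mentioned above; the function name and the choice
of k steps are mine, not from the book.

    import numpy as np

    def lanczos(A, k, rng=np.random.default_rng(0)):
        """Build an orthonormal Krylov basis Q and a symmetric tridiagonal T
        with Q.T @ A @ Q ~= T, for a symmetric matrix A.  Sketch only."""
        n = A.shape[0]
        Q = np.zeros((n, k))
        alpha = np.zeros(k)
        beta = np.zeros(k - 1)
        q = rng.standard_normal(n)
        Q[:, 0] = q / np.linalg.norm(q)
        for j in range(k):
            w = A @ Q[:, j]
            alpha[j] = Q[:, j] @ w
            w -= alpha[j] * Q[:, j]
            if j > 0:
                w -= beta[j - 1] * Q[:, j - 1]
            # full reorthogonalization: cheap insurance against the
            # loss of orthonormality the textbook warns about
            w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
            if j < k - 1:
                beta[j] = np.linalg.norm(w)
                Q[:, j + 1] = w / beta[j]
        T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
        return Q, T

    # the extreme eigenvalues of T (Ritz values) approximate those of A
    A = np.random.default_rng(1).standard_normal((200, 200))
    A = (A + A.T) / 2
    Q, T = lanczos(A, 30)
    print(np.linalg.eigvalsh(T)[-3:])   # top Ritz values
    print(np.linalg.eigvalsh(A)[-3:])   # true top eigenvalues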

 

Divide and Conquer - A more current favorite is the "divide and conquer"
algorithm, which is also described in "Matrix Computations".  In Section
8.5.4 (in the third edition, which is the current one), Golub shows how to
apply a perturbation to a tridiagonal matrix in order to break it into a
block diagonal matrix with two independent diagonal blocks.  This leads to
two half-sized problems that can be solved independently (e.g. on different
machines).  And of course, why stop at halving the system?  Why not quarter
it, or keep splitting until the problem is trivial?  The steps with
"divide and conquer" would be to apply Householder transformations to the
starting matrix to reduce it to tridiagonal form and then to split the
tridiagonal matrix into smaller pieces to do the actual eigenvalue
calculations.  A small sketch of the splitting step follows.
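
Here is a NumPy sketch of just the rank-one "tearing" of a symmetric
tridiagonal matrix into two independent blocks (the solving and merging of
the half-sized eigenproblems is the real work and is omitted); the function
names and the split index are mine.

    import numpy as np

    def split_tridiagonal(d, e, m):
        """Tear a symmetric tridiagonal matrix (diagonal d, off-diagonal e)
        into two independent blocks plus a rank-one correction:
            T = blockdiag(T1, T2) + rho * v v^T
        where rho = e[m] and v = e_m + e_{m+1} (unit coordinate vectors)."""
        rho = e[m]
        d1, e1 = d[:m + 1].copy(), e[:m].copy()
        d2, e2 = d[m + 1:].copy(), e[m + 1:].copy()
        d1[-1] -= rho   # absorb the coupling term into each block
        d2[0] -= rho
        return (d1, e1), (d2, e2), rho

    def tridiag(d, e):
        return np.diag(d) + np.diag(e, 1) + np.diag(e, -1)

    # sanity check: the blocks plus the rank-one update reproduce T
    rng = np.random.default_rng(0)
    n, m = 8, 3
    d, e = rng.standard_normal(n), rng.standard_normal(n - 1)
    (d1, e1), (d2, e2), rho = split_tridiagonal(d, e, m)
    T_blocks = np.zeros((n, n))
    T_blocks[:m + 1, :m + 1] = tridiag(d1, e1)
    T_blocks[m + 1:, m + 1:] = tridiag(d2, e2)
    v = np.zeros(n)
    v[m] = v[m + 1] = 1.0
    assert np.allclose(tridiag(d, e), T_blocks + rho * np.outer(v, v))

Each block can then be handed to a separate eigensolver (or machine), and
the two solutions are stitched back together through the rank-one update.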

 

Non-square matrices - All of the methods I've found for actually producing
singular values or singular vectors are targeted at symmetric matrices or at
most square unsymmetric ones (the Arnoldi algorithm).  I haven't seen
anything that attacks the non-square matrix that inevitably arises in latent
semantic analysis (for example).  One approach that comes to mind is the
following.  Suppose we have an m x n matrix M that needs SVD'ing.  It's easy
to show that MM^T has the same LHS singular vectors as M.  That is, if M =
UDV^T is the SVD of M, then MM^T = (UDV^T)(VDU^T) = U(D^2)U^T, so the
eigenvectors of MM^T are the LHS singular vectors of M, and the non-zero
eigenvalues of MM^T are the squares of the non-zero singular values of M.
Similarly, the eigenvectors of M^TM are the RHS singular vectors, and again
the non-zero eigenvalues of M^TM are the squares of the non-zero singular
values of M.  So an obvious way to reuse the symmetric eigenvalue machinery
above would be to calculate the eigenvectors and eigenvalues of MM^T or of
M^TM, whichever is smaller.  Typically one of these dimensions is relatively
small (on the order of the number of singular values in the final
approximation, or k log(k) for the better randomized methods, so the smaller
Gram matrix has only about k^2 entries), while the other dimension is
relatively large (e.g. the number of words or tokens that are being
counted).
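
Here is a small NumPy sketch of that trick (names mine): it forms whichever
Gram matrix is smaller, takes its symmetric eigendecomposition, and recovers
the other set of singular vectors from it.  The usual caveat applies: forming
MM^T or M^TM squares the condition number, so small singular values come out
less accurately.

    import numpy as np

    def svd_via_gram(M, k):
        """Recover the top-k singular triples of an m x n matrix M from the
        eigendecomposition of the smaller Gram matrix.  Sketch only."""
        m, n = M.shape
        if n <= m:
            G = M.T @ M                     # n x n, the smaller Gram matrix
            evals, V = np.linalg.eigh(G)    # ascending eigenvalues
            idx = np.argsort(evals)[::-1][:k]
            s = np.sqrt(np.clip(evals[idx], 0, None))   # singular values
            V = V[:, idx]                   # RHS singular vectors
            U = (M @ V) / s                 # recover LHS singular vectors
        else:
            G = M @ M.T                     # m x m is smaller instead
            evals, U = np.linalg.eigh(G)
            idx = np.argsort(evals)[::-1][:k]
            s = np.sqrt(np.clip(evals[idx], 0, None))
            U = U[:, idx]
            V = (M.T @ U) / s
        return U, s, V

    # compare against a direct SVD on a tall, skinny matrix
    rng = np.random.default_rng(0)
    M = rng.standard_normal((1000, 20))     # e.g. terms by reduced documents
    U, s, V = svd_via_gram(M, 5)
    print(s)
    print(np.linalg.svd(M, compute_uv=False)[:5])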

 

This is the best I've got so far.  If anyone has better, I'd really love to
hear about it.  

 

Mike
