Please look at this survey [1]. I am currently reading thru it. Quite comprehensive.
Where it fits us: ssvd, dssvd depend on QR accuracy quite a bit. What we have right now in "new" algebra DSL is shown there as algorithm 9, "Cholesky QR". The benefit is that it is extremely easy and parallelizable. The problem with that guy is that it is quite a bit more unstable compared to Householder's or Given's (you can test it even with non-distributed matrices, the difference quite apparent, especially in cases that start apporaching singularity). What we had in MR version was something close to what they describe as TSQR. Our approach is detailed in Nathan Halko's dissertation (reference is given on SSVD page). Our "older" MR version is much better in accuracy. It is different from TSQR in quite a few things. First, it is using reordered Given's QR for both reductions and blocking QR. using givens QR for blocks allows for sequential processing of input. Householder requires pretty much entire matrix loaded first. Second, it develops algorithm for N-way merges instead of binary merges as defined by TSQR. I haven't finish reading implementation details on TSQR, but basically what it seems to mean is that TSQR requires dynamic scheduling for merge tasks to stay at maximum parallelization efficiency. N-way formulation allows to run more than 2 tasks before each reduction step, hence, the depth of the tree could be much more shallow (log base N instead of log base 2) of number of blocks. There's also a CAQR algorithm there which i haven't even touched. I think it is worth to read thru that. Bottom line, what i think is that (1) we need to implement a well parallelized variation of one of TSQR, CAQR or Mahout MR N-way Givens QR as a more numerically stable alternative to CholeskyQR (aka "thinQR" in the sparkbindings algebra); (2) I think some of approaches in CAQR must lead to parallelization of non-thin decompositions; (3) they also consider approaches to distributed LU stuff which can open a door to solving "non-thin" Ridge regressions and such. Looking for takers and opinions on CAQR and Mahout's n-way Givens for the sparkbindings. [1] http://arxiv.org/pdf/0806.2159.pdf
