The problem is that this only approximates the singular values to the degree
that R'R = I.

In my experiments, that approach produced significant errors.
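
The dependence on R'R = I can be seen on a toy case. What follows is only a minimal sketch in plain Python, not the code used in the experiments mentioned above: the matrix A, the two choices of R, and all helper names are made up for illustration. It takes the singular values of A*R from the eigenvalues of R'A'AR and compares an orthonormal R with one that is not.

```python
import math

def matmul(A, R):
    """Plain nested-list matrix product A*R."""
    return [[sum(a * R[k][j] for k, a in enumerate(row))
             for j in range(len(R[0]))] for row in A]

def singular_values_2col(Y):
    """Singular values of a two-column matrix Y from the eigenvalues of Y'Y."""
    a = sum(r[0] * r[0] for r in Y)
    b = sum(r[0] * r[1] for r in Y)
    c = sum(r[1] * r[1] for r in Y)
    mean = (a + c) / 2.0
    spread = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]].
    return math.sqrt(mean + spread), math.sqrt(max(mean - spread, 0.0))

# Hypothetical A whose singular values are exactly 3 and 1.
A = [[3.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

theta = 0.3
R_orth = [[math.cos(theta), -math.sin(theta)],
          [math.sin(theta),  math.cos(theta)]]   # rotation, so R'R = I
R_skew = [[1.0, 0.5],
          [0.0, 1.0]]                            # shear, so R'R != I

sv_orth = singular_values_2col(matmul(A, R_orth))  # matches A up to rounding
sv_skew = singular_values_2col(matmul(A, R_skew))  # visibly different from (3, 1)
print(sv_orth, sv_skew)
```

With the rotation the values agree with those of A; with the shear both values move, which is the kind of error described above.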

On Mon, Apr 12, 2010 at 10:21 AM, Jake Mannix <jake.man...@gmail.com> wrote:

> I did read that quickly this morning, and I'm not sure exactly what the
> gain of the blockwise QR is... what is wrong with just taking each row
> "x" of A*R, producing P = x.cross(x), and emitting each of the rows of P,
> keyed on their row number, from the mapper.  Reducers (and combiners!) just
> emit (inputKey, vector_sum(values)). If you're looking for a rank k
> decomposition, you'll have O(k') different reduce keys (with k' being
> slightly larger than k), and you're efficiently parallelized in both the map
> and reduce steps, leaving the output being an HDFS SequenceFile of the rows
> of R'A'AR.  Load all of these rows into memory (it's only a k'-by-k'
> matrix) and take its SVD.  That SVD, loaded into memory in each mapper of
> another M/R pass over A*R, can then be used to get the left singular vectors.
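
For reference, the map/combine/reduce step in the quoted proposal can be sketched as follows. This is plain Python standing in for Hadoop, not Mahout code: the function names and the sample rows are hypothetical, and a dictionary plays the role of the shuffle. Each mapper call takes one row x of A*R and emits the rows of the outer product of x with itself, keyed by row number; the reducer, which can double as the combiner, sums the vectors per key, so the output rows form R'A'AR.

```python
from collections import defaultdict

def mapper(x):
    """Emit (row_number, row) for each row of the outer product of x with itself."""
    for i, xi in enumerate(x):
        yield i, [xi * xj for xj in x]

def reducer(key, vectors):
    """Sum all vectors that share a key; also usable as a combiner."""
    total = [0.0] * len(vectors[0])
    for v in vectors:
        for j, vj in enumerate(v):
            total[j] += vj
    return key, total

def gram_via_map_reduce(rows_of_AR):
    """Simulate one M/R pass and return the k'-by-k' matrix R'A'AR."""
    grouped = defaultdict(list)          # stands in for the shuffle phase
    for x in rows_of_AR:
        for key, vec in mapper(x):
            grouped[key].append(vec)
    return [reducer(k, grouped[k])[1] for k in sorted(grouped)]

# Hypothetical rows of A*R with k' = 2.
rows_of_AR = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
G = gram_via_map_reduce(rows_of_AR)
print(G)
```

There are only k' distinct keys, so the reduce side stays small no matter how many rows A has.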
