On 09/06/10 19:28, Jake Mannix wrote:
On Wed, Jun 9, 2010 at 11:19 AM, Richard Simon Just<
[email protected]>  wrote:

I don't know enough yet to comment on what works best, but I can give some
evidence that they do subtract the row average ahead of time. Sarwar's
earlier "Application of Dimensionality Reduction" paper (
http://www.grouplens.org/papers/pdf/webKDD00.pdf) uses the same prediction
function. In section 4.3.1, "Prediction Experiment", they discuss removing
the row average before the SVD computation and adding it back at prediction
time. I'd assume the incremental SVD paper builds on this.
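For concreteness, the scheme from that section (remove each row's average before the SVD, add it back for the prediction) can be sketched in NumPy. The matrix below is made-up toy data, and the rank is illustrative:

```python
import numpy as np

# Toy dense ratings matrix (hypothetical data; rows = users, cols = items).
R = np.array([
    [5.0, 3.0, 4.0, 1.0],
    [4.0, 2.0, 5.0, 1.0],
    [1.0, 1.0, 2.0, 5.0],
])

row_avg = R.mean(axis=1)           # per-user average
Rc = R - row_avg[:, None]          # remove it before the SVD
U, s, Vt = np.linalg.svd(Rc, full_matrices=False)

k = 2                              # truncation rank (illustrative)
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The prediction adds the row average back in.
pred = row_avg[:, None] + approx
```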

- Richard

I would be *very* careful about how you decompose a sparse matrix which you
center: if you naively subtract the mean from all the entries in the
vectors, an SVD which would have taken 6 hours to compute could suddenly
take weeks, literally.  But if you do the second-most-naive thing, and
subtract the means from only the nonzero entries, then all can turn out for
the best.  This is just following Sean's typical advice of "don't treat
unknown preferences as '0.0'".
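A minimal SciPy sketch of that second approach, subtracting each row's mean over its observed entries from only the nonzero entries so the matrix stays sparse (toy data; the helper name is hypothetical):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy ratings matrix: rows = users, cols = items; 0.0 means "unknown".
R = csr_matrix(np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 2.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
]))

def center_nonzeros(R):
    """Subtract each row's mean over its *observed* entries from only
    those entries, leaving the zeros (unknowns) untouched."""
    R = R.tocsr(copy=True)
    for i in range(R.shape[0]):
        start, end = R.indptr[i], R.indptr[i + 1]
        if end > start:
            R.data[start:end] -= R.data[start:end].mean()
    return R

Rc = center_nonzeros(R)
# Still sparse: only the originally observed cells were changed.
```

Only the stored entries are touched, so the structure (and hence the cost of the SVD) is unchanged; centering all cells would densify the matrix, which is the weeks-instead-of-hours case above.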

   -jake


Agreed. I just wanted to answer the question that had been left hanging about why Sarwar adds the row average back. In fact, to be complete, before the SVD they fill each null value with 'column average - row average'. But yeah, that would make for a much bigger computation.
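One reading of that fill step, sketched in NumPy on toy data: fill each null with its column average over observed entries, then subtract the row average from every cell, so the formerly-null cells end up holding 'column average - row average'. Note the result is dense, which is exactly where the much bigger computation comes from:

```python
import numpy as np

# Toy ratings matrix; 0.0 marks a null (unknown) value.
R = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 2.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
])
mask = R != 0  # observed entries

# Column and row averages taken over observed entries only.
col_avg = R.sum(axis=0) / np.maximum(mask.sum(axis=0), 1)
row_avg = R.sum(axis=1) / np.maximum(mask.sum(axis=1), 1)

# Fill nulls with the column average, then subtract the row average
# everywhere: a null cell ends up as (column average - row average).
filled = np.where(mask, R, col_avg[None, :])
centered = filled - row_avg[:, None]  # dense: every cell is now set
```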

-Richard
