Stochastic SVD

Dmitriy Lyubimov Mon, 22 Mar 2010 16:34:24 -0700

Hi all,

I had a chance to touch base quickly with Ted Dunning last weekend at the
Bay Area machine learning camp. It's my understanding that the main advantage of
this method is that a partial SVD can be achieved in a constant number of MR jobs
(Ted's analysis seemed to imply that number would be 4).


I've been following Mahout for perhaps a couple of months and have read the book
(the first 6 chapters of it, anyway) in MEA, and that's about it. But I have a
great interest in all the work happening in this project.


While it may be the case that our particular business problem can, for the time
being, be addressed by running a single-node iterative SVD (such as Lanczos
iteration, one of LAPACK's methods), it is highly likely that this will not be the
case for long. We also use Hadoop and its ecosystem for our platform, so
Mahout comes naturally into the picture (whereas MPI does not).

Anyway, starting next week, I will have to spend time on that business
need, and my boss seems to be happy if I have a chance to contribute part of
my time and results to Mahout (I guess he also expects results...
eventually :-) ). The paper seems to be the one in issue MAHOUT-309. I
skimmed it a little, and I have some questions regarding
Ted's clarifications as given at the camp this weekend and this paper (if it
is even the right one).

I do need some guidance if I am to do this, and I am wondering whether my
effort is welcome (given that I will need guidance on some details of Mahout
and the algorithms there). I guess my selfish desire is to speed up the method's
availability in Mahout.

Thank you very much.
-Dmitriy
