Hi Dmitriy,

  Stochastic SVD is high on my list of pieces to get into Mahout as
well, but is partly dependent on getting some of Ted's murmurhash stuff
from the SGD work he's got sitting idle in a patch on MAHOUT-228.

  If you could help get MAHOUT-228 finished and put in trunk, we could
quickly move forward on MAHOUT-309.  I think this can be done in
possibly only 2 MR passes, but we can chat about that a bit more
as we dig into it. :)

  -jake

On Mon, Mar 22, 2010 at 4:33 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> Hi all,
>
> i had a chance to touch base quickly with Ted Dunning last weekend at the
> Bay Area machine learning camp. It's my understanding the main advantage of
> this method is that partial SVD can be achieved in a constant # of MR jobs
> (Ted's analysis seemed to imply that number would be 4).
>
> I've been following Mahout for perhaps a couple of months and read the book
> (first 6 chapters of it anyway) in MEA, and that's about it. But i have a
> great interest in all the work happening in this project.
>
>
> While it may be the case that our particular business problem at the time
> may be addressed by running a single-node iterative SVD (such as Lanczos
> iterative, one of LAPACK's methods), it is highly likely it will not be the
> case for too long. We also use Hadoop and its ecosystem for our platform, so
> Mahout comes naturally into the picture (whereas MPI does not).
>
> Anyway, starting next week, i will have to spend time on that business
> need, and my boss seems to be happy if i have a chance to contribute part of
> my time and results to Mahout (i guess he also expects results as well...
> eventually :-) ). The paper seems to be the one in the issue MAHOUT-309; i
> skimmed it a little bit and i guess i have some questions in regards to
> Ted's clarifications as given at the camp this weekend and this paper (if it
> is even the right one).
>
> I guess i do need some guidance if i am to do this and i am wondering if my
> effort is welcome (provided i need some guidance on some details of Mahout
> and the algorithms there). I guess my selfish desire is to speed up this
> method's availability in Mahout.
>
> Thank you very much.
> -Dmitriy
>
