This is a theme that is going to come up over and over. I think that strategically, Mahout is going to have to embrace the MapReduce NextGen work so that we can have flexible computation models. We already need this for all the large-scale SVD work. We could very much use it for the SGD stuff. Now this gradient work could use it too.
New needs aren't going to stop.

On Tue, May 17, 2011 at 10:17 PM, Hector Yee <[email protected]> wrote:

> Re: boosting scalability, I've implemented it on thousands of machines, but
> not with mapreduce, rather with direct RPC calls. The gradient computation
> tends to be iterative, so one way to do it is to have each iteration run per
> mapreduce.
> Compute gradients in the mapper, gather them in the reducer, rinse and
> repeat.
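
For concreteness, here is a rough sketch of the "compute gradients in the mapper, gather them in the reducer, rinse and repeat" loop described above. It is a single-process simulation in plain Python, not Mahout or Hadoop code. The function names (`map_gradient`, `reduce_gradients`, `train`), the squared-error loss, the learning rate, and the toy data are all assumptions made for illustration.

```python
from functools import reduce


def map_gradient(shard, w):
    """Mapper: partial squared-error gradient over one data shard."""
    grad = [0.0] * len(w)
    for x, y in shard:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj
    return grad, len(shard)


def reduce_gradients(a, b):
    """Reducer: sum two partial gradients and their example counts."""
    (ga, na), (gb, nb) = a, b
    return [u + v for u, v in zip(ga, gb)], na + nb


def train(shards, dim, rate=0.1, iterations=500):
    """Driver: one simulated map/reduce pass per gradient step."""
    w = [0.0] * dim
    for _ in range(iterations):
        partials = [map_gradient(s, w) for s in shards]  # map phase
        total, n = reduce(reduce_gradients, partials)    # reduce phase
        w = [wi - rate * g / n for wi, g in zip(w, total)]
    return w


# Toy data for y = 2*x0 + 1; the second feature is a constant bias term.
shards = [
    [((0.0, 1.0), 1.0), ((1.0, 1.0), 3.0)],
    [((2.0, 1.0), 5.0), ((3.0, 1.0), 7.0)],
]
weights = train(shards, dim=2)
```

In a real deployment each pass through the loop body would be a separate job, and the per-job startup cost paid on every iteration is the overhead that makes a more flexible computation model attractive for this kind of work.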
