On Tue, Dec 8, 2009 at 6:49 PM, Zhengguo 'Mike' SUN <[email protected]>wrote:
> What stochastic decomposition trick are you guys referring to? I appreciate > if you could provide some pointers. > http://arxiv.org/abs/0909.4061 Basically the trick is: first project your column space (or row space, but let's think of the rows as your data points) randomly down from gigantically huge down to some small, but still larger than the final dimension you want to reduce to (say: project from millions of columns down to thousands, where you want to end up with a rank-300 factorization). Then take your num_rows by small_num matrix and square it, leaving you with a small_num by small_num matrix you can decompose in memory using whatever techniques you know to work on small, but dense, matrices. It works because of the Johnson-Lindenstrauss lemma<http://en.wikipedia.org/wiki/Johnson%E2%80%93Lindenstrauss_lemma>. -jake > > > ________________________________ > From: Jake Mannix <[email protected]> > To: [email protected] > Cc: [email protected] > Sent: Tue, December 8, 2009 9:08:19 PM > Subject: Re: Fwd: A MapReduce Algorithm for Matrix Multiplication > > On Tue, Dec 8, 2009 at 5:56 PM, Ted Dunning <[email protected]> wrote: > > > NMF should be amenable to the stochastic decomposition trick. That > reduces > > the problem to a much smaller factorization that you could plausibly do > > using sequential techniques. Jake Mannix is working on getting > > factorizations going, but I don't know if he has gotten to NMF. > > > > I'm not currently working on NMF, but the stochastic decomposition trick > will be > in there soon, which should allow all this pretty easily. > > Although... if you start with a positive matrix, you may want a specialized > random > projector which preserves positivity for this kind of thing. But I'm not > sure, I > haven't looked too closely at what happens when you try to do this trick on > NMF. > > -jake > > > > >
