And going down teh columns in a sparse matrix could do this to you. On Sun, Jun 26, 2011 at 6:40 PM, Jake Mannix <[email protected]> wrote:
> On Sun, Jun 26, 2011 at 1:47 PM, Vincent Xue <[email protected]> wrote: > > > Hi. > > > > I was wondering how useful an in memory sparse matrix multiplier would > > be for mahout. In my current project I needed to multiply many large > > sparse matrices but submitting hundreds of jobs to the cluster creates > > too much overhead. > > > > I wrote up an implementation of sparse matrix multiplication using > > threads which can multiply a 30,000 x 48,0000 matrix by its transpose > > in about 5 minutes using 16 cores. Granted this matrix is composed > > mostly of 1s, and -1s, (with about 16 elements per row), is this > > considered fast? I have seen that my implementation is much faster > > than iterating though a matrix naively and would like some input to > > whether or not my 5 minute benchmark is by skewed. > > > > about 16 elements per row (and column, I'm guessing?), means if you > did this using the algorithm I usually use for matrix multiplication > (matrix summing the outer products of the columns), then you're > looking at the matrix sum of 48k matrices, each of which only has > 16^2 = 256 entries. So it should take only O(256 * 48k) = 12 million > multiplications and 12 million additions. This seems like it should > be doable in well under a second using this algorithm. > > On the other hand, if you're doing 30k * 30k = 900M dot products, > most of which are zero, but require 16 operations each to discover > this, you'll need closer to 15 billion operations. This would be lot > slower. > > -jake > > > > > > Many thanks for the input, > > Vincent > > >
