Re: Current state of (dense) matrix multiplication?

Vimal Mathew Mon, 12 Apr 2010 04:16:21 -0700

The naive matrix-multiplication algorithm is highly parallelizable if
you have the data available locally at all the nodes. The persistent
storage issue was one of the first problems that I tried solving (HDFS
is just wrong for the access patterns in matrix algorithms).
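
As a rough illustration of why the naive algorithm parallelizes well once the data is local, here is a minimal sketch in plain Python. It is not the storage layer or code described in this thread. The names `multiply_band` and `parallel_matmul` are hypothetical, and threads stand in for cluster nodes, so this shows the partitioning only and gives no real speedup. Each worker gets a band of rows of A plus a full copy of B and needs no communication until the bands are concatenated.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply_band(a_band, b):
    """Naive triple loop: multiply a band of rows of A by the full matrix B."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a_band
    ]


def parallel_matmul(a, b, workers=4):
    """Split A into contiguous row bands and multiply each band independently."""
    if not a or not b or len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    band = -(-len(a) // workers)  # ceiling division: rows per worker
    bands = [a[i:i + band] for i in range(0, len(a), band)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Every task holds its own band of A and a reference to all of B,
        # which mirrors "data available locally at all the nodes".
        parts = list(pool.map(multiply_band, bands, [b] * len(bands)))
    # map() returns results in submission order, so concatenation restores C.
    return [row for part in parts for row in part]


if __name__ == "__main__":
    print(parallel_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], workers=2))
```

The row-band split is only one possible decomposition. It requires every worker to hold all of B, which is the locality assumption stated above.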


I can't compete with Matlab yet! But I am planning to add support for
SSE2 instructions, so I might get close. Also, I don't have systems with
64GB of RAM, or 14 cores in one place :(
I hope to get much better results in a month or two.
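
To put the two timings quoted in this thread side by side, here is a back-of-the-envelope sketch. It assumes the classical operation count of 2n^3 floating-point operations for an n x n dense multiply, which neither run is confirmed to have used. The run times (55 minutes, about 1 hour) and core counts (14, 272) are taken from the messages below. The helper names are hypothetical.

```python
def classical_flops(n):
    """Classical n x n dense multiply: about n^3 multiplies plus n^3 adds."""
    return 2 * n ** 3


def gflops(n, seconds):
    """Sustained rate in GFLOP/s under the 2n^3 assumption."""
    return classical_flops(n) / seconds / 1e9


def dense_bytes(n, bytes_per_element=8):
    """Storage for one dense n x n matrix of double-precision values."""
    return n * n * bytes_per_element


if __name__ == "__main__":
    n = 40_000
    matlab = gflops(n, 55 * 60)   # single machine, 14 cores, 55 minutes
    cluster = gflops(n, 60 * 60)  # 34 nodes, 272 cores, about 1 hour
    print(f"single machine: {matlab:.1f} GFLOP/s, {matlab / 14:.2f} per core")
    print(f"cluster:        {cluster:.1f} GFLOP/s, {cluster / 272:.2f} per core")
    # Two inputs plus one result, all held densely in memory.
    print(f"three matrices: {3 * dense_bytes(n) / 2**30:.1f} GiB")
```

Under that assumption both runs sustain a similar aggregate rate, roughly 39 versus 36 GFLOP/s, so the gap is almost entirely per-core efficiency. Three dense 40K x 40K double matrices also come to about 36 GiB, consistent with the memory figure quoted below.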


On Mon, Apr 12, 2010 at 12:27 AM, Steven Buss <steven.b...@gmail.com> wrote:
> If you're just doing matrix multiplication, I would advise that mahout
> (or any mapreduce approach) isn't well suited to your problem. I did
> the same computation with matlab (multiplying two 40k x 40k random
> double precision dense matrices) using 14 cores and about 36GB of ram
> on a single machine* and it finished in about 55 minutes. If I'm
> reading your email correctly, you were working with 34*2*4=272 cores!
> I'm not sure if dense matrix multiplication can actually be
> efficiently mapreduced, but I am still a rookie so don't take my word
> for it.
>
> *The machine I am working on has 8 dual core AMD opteron 875s @ 2.2GHz
> per core, with 64GB total system memory.
>
> Steven Buss
> steven.b...@gmail.com
> http://www.stevenbuss.com/
>
>
>
> On Sun, Apr 11, 2010 at 11:53 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>> Vimal,
>>
>> We don't have any distributed dense multiplication operations because we
>> have not yet found much application demand for distributed dense matrix
>> multiplication.  Distributed sparse matrix operations are a big deal,
>> however.
>>
>> If you are interested in working on the problem in the context of Mahout, we
>> would love to help.  This is especially true if you have an application that
>> needs dense operations and could benefit from some of the other capabilities
>> in Mahout.
>>
>> On Sun, Apr 11, 2010 at 1:27 PM, Vimal Mathew <vml.mat...@gmail.com> wrote:
>>
>>> Hi,
>>>  What's the current state of matrix-matrix multiplication in Mahout?
>>> Are there any performance results available for large matrices?
>>>
>>>  I have been working on a Hadoop-compatible distributed storage for
>>> matrices. I can currently multiply two 40K x 40K dense double
>>> precision matrices in around 1 hour using 34 systems (16GB RAM, two
>>> Core2Quads per node). I was wondering how this compares with Mahout.
>>>
>>> Regards,
>>>  Vimal
>>>
>>
>
