DistributedRowMatrix (DRM) is not just a demo; it is used across several
Mahout jobs, such as RowSimilarityJob.

a) Which Mahout version are you working with?
b) Have you tried MatrixMultiplicationJob, which is MapReduce-based?
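
As background on why T'*T can be spread over many mappers: the product is
the sum of the outer products of the rows of T, so each block of rows can be
processed independently and the partial results added up. Below is a minimal
sketch of that idea in plain Python. It is not Mahout code and makes no claim
about how any Mahout job is implemented; the names gram_partial, add_matrices
and gram are made up for illustration, and the "splits" are just list slices
standing in for input splits.

```python
def gram_partial(rows, n_cols):
    """Sum of outer products row' * row over one block of rows.

    This plays the role of a single mapper working on its own split.
    """
    acc = [[0.0] * n_cols for _ in range(n_cols)]
    for row in rows:
        for i, a in enumerate(row):
            if a == 0.0:
                continue  # a zero entry contributes nothing to row i
            for j, b in enumerate(row):
                acc[i][j] += a * b
    return acc


def add_matrices(x, y):
    """Element-wise sum of two equally sized matrices (the 'reduce' step)."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def gram(matrix, n_splits):
    """Compute T' * T by splitting the rows of T into about n_splits blocks."""
    n_cols = len(matrix[0])
    block = max(1, -(-len(matrix) // n_splits))  # ceiling division
    total = [[0.0] * n_cols for _ in range(n_cols)]
    for start in range(0, len(matrix), block):
        partial = gram_partial(matrix[start:start + block], n_cols)
        total = add_matrices(total, partial)
    return total


# Tiny 3x2 example; the result is 2x2.
T = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
result = gram(T, n_splits=2)
```

The result does not depend on how the rows are split, which is what makes
the computation parallel in principle; whether a given job actually gets more
than one mapper depends on how its input format computes splits.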


On Tue, Jun 17, 2014 at 3:05 AM, Han Fan <[email protected]> wrote:

> I have a 6k x 10k matrix T and I need the result of T'*T, which should be
> 10k x 10k. I want to do this using Mahout's DistributedRowMatrix, but I found
> that Hadoop runs the calculation with only one mapper, which is very slow.
>
> I dug into the source code of DistributedRowMatrix and found that its
> input format is CompositeInputFormat.class, which has a method named
> getSplits that sets mapred.min.split.size to Long.MAX_VALUE.
>
> So my question is: is DistributedRowMatrix only a demo to show that
> matrix multiplication can be done using MapReduce, with no practical
> value? Is there any way to do matrix multiplication quickly using Hadoop?
>
> Thanks for your time and sorry for my broken English.
>
>
