DistributedRowMatrix (DRM) is not just a demo; it is used by several Mahout jobs, such as RowSimilarityJob.
a) Which Mahout version are you working with?
b) Have you tried MatrixMultiplicationJob, which is MapReduce-based?

On Tue, Jun 17, 2014 at 3:05 AM, Han Fan <[email protected]> wrote:

> I have a 6k x 10k matrix T and I need the result of T'*T, which should be
> 10k x 10k. I want to do this using Mahout's DistributedRowMatrix, but I
> found that Hadoop runs the calculation with only one mapper, which is very
> slow.
>
> I dug into the source code of DistributedRowMatrix and found that its
> input format is CompositeInputFormat.class, whose getSplits method sets
> mapred.min.split.size to Long.MAX_VALUE.
>
> So my question is: is DistributedRowMatrix only a demo to show that
> matrix multiplication can be done using MapReduce, with no practical
> value? Is there any way to do matrix multiplication quickly using Hadoop?
>
> Thanks for your time and sorry for my broken English.
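
For reference, T'*T can be written as the sum over the rows t_i of T of the outer product t_i * t_i', which is why a row-partitioned matrix lends itself to this product: each row contributes independently and the partial results only need to be summed. Below is a minimal single-machine sketch of that identity in plain Java. It does not use any Mahout or Hadoop API, and the class and method names (GramMatrixSketch, transposeTimesSelf) are made up for illustration.

```java
import java.util.Arrays;

public class GramMatrixSketch {

    // Computes T' * T by accumulating, for every row t_i of T,
    // the outer product t_i * t_i'. Rows are processed independently,
    // so in a distributed setting each mapper could handle a subset of
    // rows and a reducer could sum the partial results.
    public static double[][] transposeTimesSelf(double[][] t) {
        int cols = t[0].length;
        double[][] result = new double[cols][cols];
        for (double[] row : t) {
            for (int i = 0; i < cols; i++) {
                if (row[i] == 0.0) {
                    continue; // zero entries add nothing; skip them
                }
                for (int j = 0; j < cols; j++) {
                    result[i][j] += row[i] * row[j];
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Small 3x2 example; the result is 2x2.
        double[][] t = {{1, 2}, {3, 4}, {5, 6}};
        System.out.println(Arrays.deepToString(transposeTimesSelf(t)));
        // prints [[35.0, 44.0], [44.0, 56.0]]
    }
}
```

For a 6k x 10k input the dense 10k x 10k result holds 100 million doubles (about 800 MB), so this sketch only illustrates the arithmetic and is not a substitute for a distributed job.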
