I have a 6k x 10k matrix T and I need the result of T'*T, which should be
10k x 10k. I want to do this using Mahout's DistributedRowMatrix, but I found
that Hadoop runs the job with only one mapper, which is very slow.
I dug into the source code of DistributedRowMatrix and found that its
input format is CompositeInputFormat.class, whose getSplits method sets
mapred.min.split.size to Long.MAX_VALUE. As far as I can tell, this keeps
each input file from being split, so a single file gets a single mapper.
So my question is: is DistributedRowMatrix only a demo showing that
matrix multiplication can be done with MapReduce, with no practical
value? Is there any way to do matrix multiplication quickly using Hadoop?
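
For reference, here is a minimal single-machine sketch of the computation I am after, in plain Java with no Mahout or Hadoop calls (the class name GramSketch and the method gram are made up for illustration). It uses the identity T'*T = sum over the rows t_i of t_i' * t_i, so each row contributes independently of the others, which is why I would expect the work to spread over many mappers.

```java
import java.util.Arrays;

public class GramSketch {
    // Computes T'T by accumulating the outer product of each row with itself.
    // Each row's contribution is independent, so rows could in principle be
    // handled by different workers and the partial results summed afterwards.
    static double[][] gram(double[][] t) {
        int cols = (t.length == 0) ? 0 : t[0].length;
        double[][] g = new double[cols][cols];
        for (double[] row : t) {
            for (int i = 0; i < cols; i++) {
                double ri = row[i];
                if (ri == 0.0) {
                    continue; // a zero entry adds nothing to row i of the result
                }
                for (int j = 0; j < cols; j++) {
                    g[i][j] += ri * row[j];
                }
            }
        }
        return g;
    }

    public static void main(String[] args) {
        // Tiny 3x2 stand-in for the real 6k x 10k matrix.
        double[][] t = {{1, 2}, {3, 4}, {5, 6}};
        System.out.println(Arrays.deepToString(gram(t)));
        // prints [[35.0, 44.0], [44.0, 56.0]]
    }
}
```

The result has one row and one column per column of T, so the real case would give the 10k x 10k matrix mentioned above.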
Thanks for your time, and sorry for my broken English.