Hey Shannon, I'm replying via phone, so apologies in advance for brevity:
If you have a DRM (A) which is n rows by m columns, and another DRM (B) which is m rows by p columns, there is *no single method* on DRM which computes A*B (a sensible matrix with n rows by p columns). To compute this, you would run A.transpose().times(B).

On the other hand, if you already have a matrix (call it At) with m rows by n columns, then At.times(B) will compute a matrix with n rows and p columns in one method call (and one MR pass) whose entries are exactly the same as taking the true matrix multiplication of the transpose of At times B.

Any time you use DRM.times(), both DRM instances are required to have the same number of rows (*not* the number of columns of the first equal to the number of rows of the second). In fact, as Dmitriy points out, they have to have the same number of InputSplits as well (which is easily achieved by having both be created in MR jobs with the same # of reducers).

-jake

On Jan 6, 2011 1:53 PM, "Shannon Quinn" <[email protected]> wrote:

> Matrix A has N rows (each of which has cardinality M_A), and Matrix B has
> N rows (each of whi...

I suppose this is where I get confused. I thought, by definition, matrix A has dimensions (n by m), and matrix B has dimensions (m by p), and the resulting matrix is (n by p). I saw in the implementation that it cleverly uses the transpose of A such that just the row vectors are needed, but my confusion comes from the fact that I don't see an explicit transpose before the times() job gets going.

So, in a toy example, A = [3 by 2], B = [2 by 2], it looks to me as if the three rows of A are being sent to the MR job with the two rows of B, which doesn't make any sense. I know there should be a transpose of A somewhere but I don't see it. Unless the assumption is that the user calls transpose() before calling times()? Which doesn't make any sense either since I've used this job just fine.

I know I'm missing something simple...thanks for your help.

Also: I'll shelve the general DRM rewrite patch, then, for the time being. You make good points, and there are other patches I should work on in the meantime :) (though I could just experiment with 0.21 to see how well that works)

Shannon

> There are thus N pairs of
> vectors {A_i, B_i}, and if you take MatrixSum_{i=1,N} (A_i^T x B_i...
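
To make the row pairing concrete, here is a minimal sketch in plain Java arrays. It is not the Mahout code and uses none of its classes: the names DrmTimesSketch, timesByRows, transpose and multiply are made up for illustration, and the inputs are assumed to be non-empty and rectangular. It sums the outer products of paired rows, as in the quoted MatrixSum expression, and checks on the 3 by 2 toy example that this matches ordinary A*B once A has been transposed first.

```java
import java.util.Arrays;

// Illustration only: plain arrays standing in for DRMs, no Mahout classes involved.
final class DrmTimesSketch {

    // Row-paired product: sum over the shared row index i of outer(at[i], b[i]).
    // Both inputs must have the same number of rows; the result is n by p,
    // where n = columns of at and p = columns of b.
    static double[][] timesByRows(double[][] at, double[][] b) {
        if (at.length != b.length) {
            throw new IllegalArgumentException("both inputs must have the same number of rows");
        }
        int n = at[0].length;
        int p = b[0].length;
        double[][] out = new double[n][p];
        for (int i = 0; i < at.length; i++) {
            for (int r = 0; r < n; r++) {
                for (int c = 0; c < p; c++) {
                    out[r][c] += at[i][r] * b[i][c];
                }
            }
        }
        return out;
    }

    // Plain transpose, standing in for the explicit transpose() call.
    static double[][] transpose(double[][] m) {
        double[][] t = new double[m[0].length][m.length];
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[0].length; j++) {
                t[j][i] = m[i][j];
            }
        }
        return t;
    }

    // Textbook (n by m) times (m by p) multiplication, for comparison.
    static double[][] multiply(double[][] a, double[][] b) {
        double[][] out = new double[a.length][b[0].length];
        for (int r = 0; r < a.length; r++) {
            for (int c = 0; c < b[0].length; c++) {
                for (int k = 0; k < b.length; k++) {
                    out[r][c] += a[r][k] * b[k][c];
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}, {5, 6}};   // 3 by 2, as in the toy example
        double[][] b = {{7, 8}, {9, 10}};          // 2 by 2
        double[][] at = transpose(a);              // 2 by 3: same row count as b
        System.out.println(Arrays.deepToString(timesByRows(at, b)));
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

Both lines print the same 3 by 2 matrix. Passing a and b directly to timesByRows throws, since 3 rows cannot be paired with 2, which is the mismatch in the toy example.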
