On Monday, 4 March 2013 at 15:46:50 UTC, jerro wrote:
A bit better version:
http://codepad.org/jhbYxEgU
I think this code is good compared to the original (there are
better algorithms).
You can make it much faster even without really changing the
algorithm. Just by reversing the order of inner two loops like
this:
void matrixMult2(in int[][] m1, in int[][] m2, int[][] m3) pure
nothrow {
foreach (immutable i; 0 .. m1.length)
foreach (immutable k; 0 .. m2[0].length)
foreach (immutable j; 0 .. m3[0].length)
m3[i][j] += m1[i][k] * m2[k][j];
}
you can make the code much more cache friendly (because now you
aren't iterating any matrix by column in the inner loop) and
also allow the compiler to do auto vectorization. matrixMul2()
takes 2.6 seconds on my machine and matrixMul()takes 72 seconds
(both compiled with gdmd -O -inline -release -noboundscheck
-mavx).
This isn't really relevant to the comparison with C++ in this
thread, I just thought it may be useful for anyone writing
matrix code.
forgot to set m3's elements to zero before adding to them:
void matrixMult2(in int[][] m1, in int[][] m2, int[][] m3) pure
nothrow {
foreach (immutable i; 0 .. m1.length)
{
m3[i][] = 0;
foreach (immutable k; 0 .. m2[0].length)
foreach (immutable j; 0 .. m3[0].length)
m3[i][j] += m1[i][k] * m2[k][j];
}
}
This does not make the function noticeably slower.