On 30/08/2016 1:02 AM, rikki cattermole wrote:
On 30/08/2016 12:13 AM, Steinhagelvoll wrote:
OK, I added -release and ran the benchmark for 500 iterations;
10000 is not reasonable. I built the 2D array version with LDC:
http://pastebin.com/aXxzEdS4 (changes just in the beginning)

$ ldc2 -release -O3 nd_test.d
$ ./nd_test
12 minutes, 18 secs, 21 ms, 858 μs, and 3 hnsecs

That is 738 seconds, compared to the following (also 500 iterations):

ifort -O3 -o fort_test test.f90 && ./fort_test
 time:    107.4640    seconds


This still seems like a big difference. Is it because I don't use a
contiguous piece of memory, but rather a pointer to a pointer?

double[1000][] A, B, C;

void main() {
        A = new double[1000][1000];
        B = new double[1000][1000];
        C = new double[1000][1000];

        import std.conv : to;
        import std.datetime;
        import std.stdio : writeln;

        ini(A);
        ini(B);
        ini(C);

        auto r = benchmark!run_test(10000);
        auto res = to!Duration(r[0]);
        writeln(res);
}

void run_test() {
        MatMul(A, B, C);
}

void ini(T)(T mtx) {
        foreach(v; mtx) {
                v = 3.4;
        }

        foreach(i, v; mtx) {
                foreach(j, vv; v) {
                        vv += (i * j) + (0.6 * j);
                }
        }
}

void MatMul(T)(T A, T B, T C) {
        foreach(cv; C) {
                cv = 0f;
        }

        foreach(i, cv; C) {
                foreach(j, av; A[i]) {
                        foreach(k, cvv; cv) {
                                cvv += av * B[j][k];
                        }
                }
        }
}

$ ldc2 test.d -O5 -release -oftest.exe -m64
$ ./test
3 secs, 995 ms, 115 μs, and 2 hnsecs

Please verify that it is still doing the same thing that you want.
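
One thing worth checking when verifying: in D, a foreach loop variable is a copy of the element unless it is declared `ref`, so a write such as `v = 3.4` or `cvv += ...` inside the loop changes only the copy. A minimal sketch of the difference follows; the helper names `fillByValue` and `fillByRef` and the tiny 3-column matrix are made up for illustration and are not part of the benchmark.

```d
// Hypothetical helpers, named for illustration only.

// Iterates by value: each `row` is a copy of a double[3],
// so the matrix `m` is left unchanged.
void fillByValue(double[3][] m, double x) {
    foreach (row; m) {
        row[] = x; // writes into the copy only
    }
}

// Iterates by reference: the writes land in `m` itself.
void fillByRef(double[3][] m, double x) {
    foreach (ref row; m) {
        row[] = x; // sets every element of this row of `m`
    }
}
```

Calling `fillByValue` on a matrix leaves its contents as they were, while `fillByRef` overwrites every element. Comparing one or two entries of C against a hand-computed value is a quick way to see which case a given loop falls into.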

The change below is slightly faster:

        foreach(i, cv; C) {
                foreach(j, av; A[i]) {
                        auto bv = B[j];
                        foreach(k, cvv; cv) {
                                cvv += av * bv[k];
                        }
                }
        }
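
For checking results, the same loop order with the row lookup hoisted can be tried on a small matrix. The sketch below is an assumption-laden variant, not the benchmarked code: `N` and `matMul` are illustrative names, the loop variables are declared `ref` so the writes reach C, and `B[j][]` takes a slice of row j rather than copying it.

```d
// Hypothetical small-matrix check; `N` and `matMul` are illustrative names.
enum N = 2;

void matMul(const double[N][] A, const double[N][] B, double[N][] C) {
    foreach (ref cv; C) {
        cv[] = 0.0; // zero the output row in place
    }

    foreach (i, ref cv; C) {
        foreach (j, av; A[i]) {
            // Hoist the row lookup out of the inner loop.
            // The slice is a view of row j, not a copy.
            const(double)[] bv = B[j][];
            foreach (k, ref cvv; cv) {
                cvv += av * bv[k]; // C[i][k] += A[i][j] * B[j][k]
            }
        }
    }
}
```

With A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], this fills C with [[19, 22], [43, 50]], which matches the product computed by hand.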
