On Friday, 31 May 2013 at 05:35:58 UTC, Juan Manuel Cabo wrote:
On 05/31/2013 02:15 AM, nazriel wrote:
On Friday, 31 May 2013 at 01:26:13 UTC, finalpatch wrote:
Recently I ported a simple ray tracer I wrote in C++11 to D. Thanks to the similarity between D and C++ it was almost a line-by-line translation, in other words, very very close. However, the D version runs much slower than the C++11 version. On Windows, with MinGW GCC and GDC, the C++ version is twice as fast as the D version. On OSX, I used Clang++ and LDC, and the C++11 version was 4x faster than the D version. Since the comparisons were between compilers that share the same codegen backends, I suppose that's a relatively fair comparison. (Flags used for GDC: -O3 -fno-bounds-check -frelease; flags used for LDC: -O3 -release.)

I really like the features offered by D, but it's the raw performance that's worrying me. From what I read, D should offer similar performance when doing similar things, but my own test results are not consistent with this claim. I want to know whether this slowness is inherent to the language or whether it's something I was not doing right (very possible, because I have only a few days of experience with D).

Below are the links to the D and C++ code, in case anyone is interested in having a look.

https://dl.dropboxusercontent.com/u/974356/raytracer.d
https://dl.dropboxusercontent.com/u/974356/raytracer.cpp

Greetings.

After a few quick changes I managed to get these results:
[raz@d3 tmp]$ ./a.out
rendering time 276 ms
[raz@d3 tmp]$ ./test
346 ms, 814 μs, and 5 hnsecs


./a.out is the binary compiled with: clang++ ./test.cxx -std=c++11 -lSDL -O3
./test is the binary compiled with: ldmd2 -O3 -release -inline -noboundscheck ./test.d
(Actually I used rdmd with --compiler=ldmd2, but I omitted that because the command line was rather long :p)


Here is the source code with the changes I applied to the D code (I hope you don't mind me repasting it): http://dpaste.dzfl.pl/84bb308d

I am sure there is a lot more room for improvement, enough to at least match the C++ performance.


You might also try changing:

            float[3] t = mixin("v[]"~op~"rhs.v[]");
            return Vec3(t[0], t[1], t[2]);

for:
            Vec3 t;
            t.v[0] = mixin("v[0] "~op~" rhs.v[0]");
            t.v[1] = mixin("v[1] "~op~" rhs.v[1]");
            t.v[2] = mixin("v[2] "~op~" rhs.v[2]");
            return t;

and so on, avoiding the float[3] temporary and the v[] array operations, which would loop unless the compiler/optimizer unrolls them (I didn't check).

I tested this change (removing v[] ops) in Vec3 and in
normalize(), and it made your version slightly faster
with DMD (didn't check with ldmd2).
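
For anyone who wants to try the per-component form in isolation, here is a minimal self-contained sketch. The Vec3 layout (a single float[3] member named v plus a three-argument constructor) is an assumption inferred from the snippets above, not a copy of the code in the paste, and the template constraint on op is added only for illustration.

```d
import std.stdio;

// Assumed layout: three floats stored in a static array named v.
struct Vec3
{
    float[3] v;

    this(float x, float y, float z)
    {
        v[0] = x;
        v[1] = y;
        v[2] = z;
    }

    // Element-wise binary operator written out per component,
    // so no float[3] temporary and no v[] array operation is needed.
    Vec3 opBinary(string op)(in Vec3 rhs) const
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        Vec3 t;
        t.v[0] = mixin("v[0] " ~ op ~ " rhs.v[0]");
        t.v[1] = mixin("v[1] " ~ op ~ " rhs.v[1]");
        t.v[2] = mixin("v[2] " ~ op ~ " rhs.v[2]");
        return t;
    }
}

void main()
{
    auto a = Vec3(1, 2, 3);
    auto b = Vec3(4, 5, 6);
    auto c = a + b;

    // Each component is combined independently.
    assert(c.v[0] == 5 && c.v[1] == 7 && c.v[2] == 9);
    writeln(c.v);
}
```

Whether this beats the v[] version depends on the compiler and flags, so it is worth timing both variants rather than assuming a gain.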

--jm


Right, I missed that. Thanks!

Now it is:

[raz@d3 tmp]$ ./a.out
rendering time 276 ms
[raz@d3 tmp]$ ./test
238 ms, 35 μs, and 7 hnsecs

So the D version is now faster than the C++ one.
