On Monday, 4 March 2013 at 04:49:20 UTC, Andrei Alexandrescu wrote:
You're measuring the speed of a couple of tight loops. The smallest differences in codegen between them will be on the radar. Use straight for loops or "foreach (i; 0 .. limit)" for those loops...

Thanks Andrei!

I validated your analysis by doing a straight port of the C code to D, even using the same memory layout (matching malloc calls). The port was trivial, which was very reassuring: I had to tweak only 8 lines of the C to make it compile under gdc. The mmult() function in the D version remained identical to that in the C version.
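For readers following along, a straight-loop multiply over malloc'd rows might look roughly like the sketch below. This is not the benchmark's actual code: the names alloc_matrix, free_matrix and mmult_sketch, the int element type, and the one-allocation-per-row layout are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate an n x n matrix as an array of row
   pointers, one zero-initialised allocation per row.
   Returns NULL if any allocation fails. */
int **alloc_matrix(int n)
{
    int **m = malloc((size_t)n * sizeof *m);
    if (m == NULL)
        return NULL;
    for (int i = 0; i < n; i++) {
        m[i] = calloc((size_t)n, sizeof **m);
        if (m[i] == NULL) {
            /* Roll back the rows allocated so far. */
            while (i-- > 0)
                free(m[i]);
            free(m);
            return NULL;
        }
    }
    return m;
}

/* Hypothetical helper: release a matrix created by alloc_matrix. */
void free_matrix(int **m, int n)
{
    for (int i = 0; i < n; i++)
        free(m[i]);
    free(m);
}

/* c = a * b for n x n matrices, written with plain counted for loops
   so the inner loop stays as simple as possible for the compiler. */
void mmult_sketch(int n, int **a, int **b, int **c)
{
    assert(n > 0 && a != NULL && b != NULL && c != NULL);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            int sum = 0;
            for (int k = 0; k < n; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
    }
}
```

In D the same counted loops can also be spelled "foreach (i; 0 .. n)", as suggested in the quoted reply.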

Even more reassuring, the performance of the resulting D matched the C to within 1% (about 200-500 msec slower on the D side, presumably due to runtime init).

$ time ./gdc_compiled_ported_cpp_version
-1015380632 859379360 -367726792 -1548829944

real    1m32.418s
user    1m32.370s
sys     0m0.020s
$

It's a great feeling, knowing the bare metal is available if need be.

Thanks guys!

-J
