On Friday, 6 November 2015 at 11:38:29 UTC, Marc Schütz wrote:
On Friday, 6 November 2015 at 11:37:22 UTC, Marc Schütz wrote:
Ok, benchA and benchB have the same assembler code generated.
However, I _can_ reproduce the slowdown albeit on average only
20%-40%, not a factor of 10.
Forgot to add that this is on Linux x86_64, so that probably
explains the difference.
It turns out that it's always the first tested function that's
slower. You can test this by switching benchA and benchB in
the call to benchmark(). I suspect the reason is that the OS
is paging in the code the first time, and we're actually
seeing the cost of the page fault. If you run a second round of
benchmarks after the first one, that one shows more or less
the same performance for both functions.
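
For reference, a minimal sketch of the "second round" check described above. The bodies of benchA and benchB, the Vec struct, the sink variable and the runRound helper are hypothetical stand-ins, not the code from the original post. It also assumes a current Phobos, where benchmark() lives in std.datetime.stopwatch and returns Duration values.

```d
import core.time : Duration;
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;

// Hypothetical stand-ins for the functions under test.
struct Vec { float x, y; }
__gshared Vec sink; // global store, so the work is less likely to be optimised away

void benchA() { Vec v; sink = v; }             // default initializer
void benchB() { Vec v = Vec(0, 0); sink = v; } // explicit values

// Runs each function n times; result[0] is benchA, result[1] is benchB.
Duration[2] runRound(uint n)
{
    return benchmark!(benchA, benchB)(n);
}

void main()
{
    enum iterations = 10_000_000;

    // Warm-up round: results are discarded, so any one-time cost
    // (such as paging in the code) is paid before measuring.
    runRound(iterations);

    foreach (round; 1 .. 3)
    {
        auto r = runRound(iterations);
        writefln("round %s: benchA %s, benchB %s", round, r[0], r[1]);
    }
}
```

Swapping the template arguments to benchmark!(benchB, benchA) is a quick way to check whether the slowdown follows the function or the position.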
I tested swapping the functions around on Windows x86 and I still
get the same slowdown with the default initializer. On Windows
x64, both functions still run at basically the same speed.
Interestingly enough, the slowdown disappears if I add another
float variable to the structs. This changes the generated assembly
to use different instructions, so I guess that is why. It also
only seems to affect small structs with floats in them. If I
change the members to int, both versions run at the same speed on
x86 as well.
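
To make the struct variants concrete, here is a rough sketch. The struct names and member counts are made up for illustration and are not the ones from my benchmark. It only shows what the default initializer has to store in each case: NaN for float members, 0 for int members. Whether that difference is really what causes the slowdown on x86 is a guess.

```d
import std.math : isNaN;
import std.stdio : writeln;

// Hypothetical structs; names and member counts are illustrative only.
struct TwoFloats   { float x, y; }
struct ThreeFloats { float x, y, z; }
struct TwoInts     { int x, y; }

void main()
{
    TwoFloats f; // default initializer: every float member starts as NaN
    TwoInts i;   // default initializer: every int member starts as 0

    assert(isNaN(f.x) && isNaN(f.y));
    assert(i.x == 0 && i.y == 0);

    // Sizes in bytes of the three variants.
    writeln(TwoFloats.sizeof, " ", ThreeFloats.sizeof, " ", TwoInts.sizeof);
}
```

Comparing the disassembly of the default-initialized and explicitly initialized versions for each of these variants should show where the instruction selection changes.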