On Sunday, 5 October 2014 at 21:53:23 UTC, Ali Çehreli wrote:
On 10/05/2014 02:40 PM, Sativa wrote:
> foreach(i; thds) { ulong s = 0; for(ulong k = 0; k < iter/numThreads; k++)
The for loop condition is evaluated at every iteration, and
division is an expensive operation. Apparently, the compiler
does some optimization when the divisor is known at compile
time.
When the divisor is 4, the division is just a right shift by 2
bits. Try something like 5; it is slow even for an enum.
This solves the problem:
const end = iter/numThreads;
for(ulong k = 0; k < end; k++) {
Ali
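A small sketch of the shift remark above, for unsigned operands only. The function names divByFour and shiftByTwo are made up for illustration and do not appear in the thread; whether a given compiler actually emits a shift depends on the compiler and its optimization flags.

```d
// Hypothetical helpers, not from the original code.
// For an unsigned operand, dividing by 4 yields the same value as
// shifting right by 2 bits, which is why a compile-time divisor of 4
// can be cheap.
ulong divByFour(ulong x) { return x / 4; }
ulong shiftByTwo(ulong x) { return x >> 2; }

void main()
{
    // Check the equivalence on a few sample values.
    foreach (ulong x; [0UL, 1, 7, 8, 1_000_003])
        assert(divByFour(x) == shiftByTwo(x));
}
```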
Yes, it is a common problem when the loop bound of a for loop
involves a computation. Most of the time the bound is constant for
the whole loop, but it can still end up being recomputed every
iteration. When doing a simple sum (when the loop body does not do
much), this becomes expensive, since the cost of the bound check is
comparable to what is happening inside the loop.
It's surprising just how slow it makes it, though. One can't
really make numThreads a compile-time constant in the real world,
as that wouldn't be optimal (unless one had a version for each
possible number of threads).
Obviously one can just move the computation outside the loop. I
would expect better results if the loops actually did some real
work.
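
A minimal sketch of moving the computation outside the loop when numThreads is only known at run time. The function name partialSum and the values passed to it are assumptions for illustration, not the original benchmark.

```d
import std.stdio : writeln;

// Hypothetical helper: sums k for k in 0 .. iter/numThreads - 1,
// the way one thread's share of the work might look.
ulong partialSum(ulong iter, ulong numThreads)
{
    // numThreads is a run-time value here, so the division cannot be
    // folded at compile time. Computing it once before the loop means
    // the loop condition is a plain comparison.
    immutable end = iter / numThreads;
    ulong s = 0;
    for (ulong k = 0; k < end; k++)
        s += k;
    return s;
}

void main()
{
    writeln(partialSum(1_000, 5)); // prints 19900 (0 + 1 + ... + 199)
}
```

This keeps numThreads a normal variable, so no separate version per thread count is needed.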