On 02.08.2013 18:37, Walter Bright wrote:
On 8/2/2013 2:47 AM, Rainer Schuetze wrote:
My disassembly looks exactly the same. I don't think that a single div in a rather long function has a lot of impact on modern processors. I'm running an i7; according to the instruction tables by Agner Fog, the div has a latency of 17-28 cycles and a reciprocal throughput of 7-17 cycles. If I estimate the latency of the asm snippet, I also get 16 cycles. And that doesn't take the additional tests and jumps into consideration.

I'm using an AMD FX-6100.

This processor seems to do a little better with the mov reg,imm operation but otherwise is similar. The DIV operation has a larger worst-case latency, though (16-48 cycles).

Better to just use a power of 2 for the array sizes anyway...
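
A minimal sketch in C of why a power-of-2 size sidesteps the DIV, assuming the division comes from reducing an index modulo the array size. The helper names wrap_any and wrap_pow2 are hypothetical and not taken from the code under discussion:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reduce an index modulo an arbitrary non-zero size.
   With a size only known at run time this needs a hardware divide. */
static size_t wrap_any(size_t i, size_t size)
{
    return i % size;
}

/* Hypothetical helper: the same reduction when size is a power of 2.
   For unsigned values, i % size equals i & (size - 1), which is a single AND. */
static size_t wrap_pow2(size_t i, size_t size)
{
    /* A power of 2 has exactly one bit set, so size & (size - 1) is 0. */
    assert(size != 0 && (size & (size - 1)) == 0);
    return i & (size - 1);
}
```

Both helpers return the same result for power-of-2 sizes, e.g. index 37 in a 16-element array wraps to 5 either way. If the size is an unsigned compile-time constant, compilers typically do this replacement on their own; the explicit mask only matters when the size is a run-time value.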

