On 02.08.2013 18:37, Walter Bright wrote:
On 8/2/2013 2:47 AM, Rainer Schuetze wrote:
My disassembly looks exactly the same. I don't think that a single div
in a rather long function has a lot of impact on modern processors. On
an i7, according to the instruction tables by Agner Fog, the div has a
latency of 17-28 cycles and a reciprocal throughput of 7-17 cycles. If I
estimate the latency of the asm snippet, I also get 16 cycles. And that
doesn't take the additional tests and jumps into consideration.
I'm using an AMD FX-6100.
This processor seems to do a little better with the mov reg,imm
operation but otherwise is similar. The DIV operation has a larger
worst-case latency, though (16-48 cycles).
Better to just use a power of 2 for the array sizes anyway...
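
To illustrate the power-of-2 suggestion: when the array size is a power of 2, the index can be computed with a single AND instead of a DIV. The following is only a minimal sketch in C, not the code under discussion in this thread; the helper names index_mod and index_mask are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index into an array of arbitrary size.
   The % operator typically compiles to a DIV when size is not a
   compile-time constant. */
static size_t index_mod(size_t hash, size_t size)
{
    return hash % size;
}

/* Hypothetical helper: index into an array whose size is a power of 2.
   For such sizes, hash % size equals hash & (size - 1), so a single AND
   replaces the division. */
static size_t index_mask(size_t hash, size_t size)
{
    /* A power of 2 has exactly one bit set. */
    assert(size != 0 && (size & (size - 1)) == 0);
    return hash & (size - 1);
}
```

Both helpers return the same index for any power-of-2 size, so the masked version can be swapped in wherever the size is guaranteed to be a power of 2.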