On Mon, Nov 29, 2010 at 10:44:11AM +0000, Sad Clouds wrote: > As for calculating the hash, modern hardware has fast modulo > instructions. A very simple test on Intel Atom D510 processor, seems > to indicate a single 'interger % prime' is roughly equal to two > integer operations.
No, divisions and modulo are still very slow. According the Intel Architecture Guide, div has a latency of at least 56 cycles compared to less than 20 cycles for all integer multiplication instructions. Throughput is at best 1/4 as well. Please check your test case (and the assembler output) carefully for not hitting the unsigned division optimisation I mentioned earlier... Joerg
