Note that if you look at the implementation of `proc rand*(r: var Rand; max: 
Natural)`, you can do this faster in a loop when `max` does not change inside 
that loop. The `mod` operand is loop-invariant, so `randMax mod Ui(max)` can be 
computed once up front instead of on every function call.
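
A minimal sketch of that hoisting, assuming the stdlib proc is a rejection loop over `next(r)`. The names `BoundedSampler`, `initBoundedSampler` and `draw` are made up for illustration and are not part of `std/random`. The acceptance test here is written from scratch and may differ in detail from the stdlib one.

```nim
import std/random

type
  BoundedSampler = object   # hypothetical helper, not in std/random
    span: uint64            # max + 1, the number of possible results
    limit: uint64           # raw draws below this value are accepted

proc initBoundedSampler(max: Natural): BoundedSampler =
  let span = uint64(max) + 1'u64
  # The one `mod` that depends only on `max`: computed once, here.
  result = BoundedSampler(span: span,
                          limit: high(uint64) - (high(uint64) mod span))

proc draw(r: var Rand; s: BoundedSampler): int =
  # `limit` is a multiple of `span`, so accepted draws map evenly
  # onto 0 .. max; anything at or above `limit` is redrawn.
  while true:
    let x = r.next()
    if x < s.limit:
      return int(x mod s.span)   # the remaining range-reduction `mod`

var r = initRand(42)
let s = initBoundedSampler(9)    # set up once, outside the loop
for _ in 1 .. 5:
  echo r.draw(s)                 # values in 0 .. 9
```

Only the per-draw `x mod s.span` is left inside the loop.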

That still leaves the final range-reduction `mod`. I am pretty sure that can be 
turned into a floating-point multiply (probably making the result even faster 
than the gcc-const-optimized variant). Of course, dirtying the FP 
registers/state can make all your context switches slower, but that probably 
will not matter in numerics-heavy workloads.
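
One way the multiply idea could look, as a rough sketch rather than a drop-in replacement: take the top 53 bits of the raw draw as a float in [0, 1) and scale by `max + 1`. `randScaled` is a hypothetical name. This variant skips the rejection step entirely, so it carries a small bias, and it assumes `max` is well below 2^53 so the float has enough precision. I have not benchmarked it.

```nim
import std/random

const inv2pow53 = 1.0 / 9007199254740992.0   # 2^-53

proc randScaled(r: var Rand; max: Natural): int =
  # Hypothetical helper, not in std/random.
  # Top 53 bits of the raw draw become a float in [0, 1).
  let u = float(r.next() shr 11) * inv2pow53
  # Scale to [0, max + 1) and truncate; no integer `mod` involved.
  # The clamp guards against float rounding for very large `max`.
  result = min(int(u * (float(max) + 1.0)), max)

var r = initRand(42)
for _ in 1 .. 5:
  echo r.randScaled(9)   # values in 0 .. 9
```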
