Tim Peters <t...@python.org> added the comment:

A more principled change would be to replace instances of this:

    i = (i*5 + perturb + 1) & mask;

with this:

    i = (i*5 + (perturb << 1) + 1) & mask;

The latter spelling has no fixed points.  That's easy to see:  `(perturb << 1) 
+ 1` is necessarily odd, and then `i*5 + ODD` is even when `i` is odd, and vice 
versa.

I had hoped that the time for a new shift could overlap with the multiply 
latency, but no such luck.  At least Visual Studio didn't multiply to begin 
with:  it first computes `i*4 + perturb` cheaply with a single LEA instruction, 
then adds 1 (INC), then adds in `i` again.

I expect it would be able to fold in a "free" shifted add of perturb with 
another LEA, so the latter spelling isn't necessarily more expensive.  But I'm 
generally loathe to increase operation counts relying on clever compilers to 
exploit processor-specific tricks.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue38105>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to