Tim Peters <[email protected]> added the comment:
Following up, at least under Visual Studio for x86 "it's free" to change the
code to add in `perturb` shifted left. The C source:
perturb >>= PERTURB_SHIFT;
i = (i*5 + (perturb << 1) + 1) & mask;
compiles to this, where I added comments relating the instructions to the
source code:
lea rax,[r14+r14*4] # rax = i*5
shr r12,5 # r12(perturb) >>= 5
lea r14,[r12*2+1] # r14(i) = (perturb << 1) + 1
add r14,rax # r14(i) += i*5
and r14,rbx # r14(i) &= mask
Good job! There are actually fewer machine instructions than C operators.
As noted, this change would eliminate the possibility of fixed points from the
start, so would eliminate the symptoms in this specific bug report.
Which I don't much care about ;-) But I _do_ care about that, in general, the
early stages of collision resolution for small tables can "frequently" visit
slots more than once. This is true even if the hashes aren't identical. And
it would remain true even with the change (cycles can still exist - that there
are no fixed points just means there are no more cycles of the minimum possible
length 1).
Improving that is worth some effort, but not much. In general, hash codes
aren't identical, and in small tables redundant slot visits just cost the time
to read the hash code from cache and re-deduce that it's not equal to what
we're looking for (__eq__ isn't typically called).
In larger tables redundant slot visits in the early stages are too rare to care
about.
In small tables, it's worth something to guarantee that the first probe after a
collision can't hit exactly the same slot again (in a random model, there's a
12.5% chance of that happening now in the smallest tables - 12.5% is 1 in 8 -
and this bug report contrived cases where the chance is 100% across a dozen
iterations).
Anyone up a for a world of tedious benchmarking? ;-)
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue38105>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com