[issue30671] dict: improve lookup function

2017-06-18 Thread Dmitry Rubanovich
Dmitry Rubanovich added the comment: A few notes on my motivation and other comments I made(not necessarily in the order of significance): 1. I started looking at this because of the fix in Issue28201. It effectively improves on the difference between print_collision_counts

[issue30671] dict: improve lookup function

2017-06-16 Thread Dmitry Rubanovich
Dmitry Rubanovich added the comment: Changing the title, to remove "simplify", since the discussion revealed that the potential improvement would have to be treated as a special case and, therefore, would not simplify the code. -- title: dict: simplify and improve looku

[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread Dmitry Rubanovich
Dmitry Rubanovich added the comment: Tim, I am not testing randomly. The scripts calculate secondary hash for **each** value in a ring to see how often this results in duplicate (or triplicate, etc.) values. And they do it for each (mod 2**i) ring for i between 8 and 20. Then (mostly as an

[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread Dmitry Rubanovich
Dmitry Rubanovich added the comment: Yes, I do understand it. But the statement in lines 166, 167: "For any initial j in range(2**i), repeating that 2**i times generates each int in range(2**i) exactly once" does not hold when the perturb is added. 2**i times will not be enough t

[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread Dmitry Rubanovich
Dmitry Rubanovich added the comment: I am looking at the 3.6.1 code. I am not trying to simulate the index lookup as the lookup function would do it. There is no point in that. The point of the script is to detect collisions among perturbed secondary indexes rather than between primary

[issue30671] dict: simplify and improve lookup function

2017-06-15 Thread Dmitry Rubanovich
New submission from Dmitry Rubanovich: lookdict_index() (and the rest of the files in dictobject.c) are using unnecessarily complicated perturb mechanism. And, in fact, it's slower than the simpler case. Instead of this: for (size_t perturb = hash;;) { perturb >>= PERTURB_S

[issue29304] dict: simplify lookup function

2017-06-14 Thread Dmitry Rubanovich
Dmitry Rubanovich added the comment: lookdict_index() (and the rest of the files in dictobject.c) are using unnecessarily complicated perturb mechanism. And, in fact, it's slower than the simpler case. Instead of this: for (size_t perturb = hash;;) { perturb >>= PERTURB_S