[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-13 Thread Tim Peters
Tim Peters added the comment: Some results of the "add perturb shifted left 1 instead" approach. These come from using an old pile of Python code I have that allows for easy investigation of different collision probe strategies. - As expected, because it has no fixed points, there are no ba

[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-12 Thread Tim Peters
Tim Peters added the comment: Following up, at least under Visual Studio for x86 "it's free" to change the code to add in `perturb` shifted left. The C source: perturb >>= PERTURB_SHIFT; i = (i*5 + (perturb << 1) + 1) & mask; compiles to this, where I added comments relating the ins

[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-12 Thread Tim Peters
Tim Peters added the comment: A more principled change would be to replace instances of this: i = (i*5 + perturb + 1) & mask; with this: i = (i*5 + (perturb << 1) + 1) & mask; The latter spelling has no fixed points. That's easy to see: `(perturb << 1) + 1` is necessarily odd, an

[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-12 Thread Tim Peters
Tim Peters added the comment: Something that may be slightly worth pursuing: in j = (5*j + 1) % 2**power to get a full-period sequence hitting every integer in range(2**power) exactly once, the multiplier (5) is a bit delicate (it has to be congruent to 1 mod 4), but the addend (1) onl

[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-11 Thread Raymond Hettinger
Raymond Hettinger added the comment: [Tim Peters] > I don't believe it's worth a single cycle, overall, to avoid it. That makes complete sense. Marking this one as closed. Guido, thank for the high quality, reproducible report. hongweipeng, thank you for the casual analysis. -- nos

[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-11 Thread Tim Peters
Tim Peters added the comment: Note that you can contrive similar cases with positive hash codes too. For example, force both hash codes to 2**60 - 2. The salient points are that 0 * 5 is congruent to 0 mod any power of 2, while -2 * 5 = -10 is congruent to -2 mod 8, so they're fixed points

[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-11 Thread hongweipeng
hongweipeng added the comment: More than -2, -1 -4 -8 -16 and -32 will cause many calls to __eq__.In `set_add_entry` use ``` perturb >>= PERTURB_SHIFT; i = (i * 5 + 1 + perturb) & mask; ``` get the next index.In the example,mask is 7,perturb is -2. If i = 6, after execution, the value of i h

[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-11 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- nosy: +serhiy.storchaka ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe: https

[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-11 Thread Guido Imperiale
Guido Imperiale added the comment: Of course, the chances that in real life a custom object will have hash == -2 are very small. Different story however is for the actual numbers -1 and -2: In [2]: %timeit {-1, -3}

[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-11 Thread Steven D'Aprano
Steven D'Aprano added the comment: Here's a possibly simpler version which nevertheless still shows multiple calls to __eq__ (in this case, only 6, not 13): class C(object): def __eq__(self, other): print('eq') return super().__eq__(other) def __hash__(self): r

[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-11 Thread Guido Imperiale
Guido Imperiale added the comment: Forgot a counter-example: {C(1, 0), C(2, 0)} C((1, 0)).__hash__ C((2, 0)).__hash__ C((1, 0)).__eq__(C((2, 0))) {C((1, 0)), C((2, 0))} -- ___ Python tracker ___

[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

2019-09-11 Thread Guido Imperiale
New submission from Guido Imperiale : Take two objects where hash(a) == hash(b) == -2 exactly, but a != b, When they are inserted in a dict or set, the __eq__ method is invoked a whopping 13 times. Does this have something to do with the special case of hash(-1) = -2? class C: def __init_