[issue40889] Symmetric difference on dict_views is inefficient

2020-06-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I don't like the tendency to optimize very uncommon cases. We can optimize every operation for specific types by inlining the code. But that increases the maintenance cost and can have a negative net effect on performance, because increasing the code size and

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-09 Thread Inada Naoki
Change by Inada Naoki: nosy: -inada.naoki; resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-09 Thread Inada Naoki
Inada Naoki added the comment: New changeset 07d81128124f2b574808e33267c38b104b42ae2a by Dennis Sweeney in branch 'master': bpo-40889: Optimize dict.items() ^ dict.items() (GH-20718) https://github.com/python/cpython/commit/07d81128124f2b574808e33267c38b104b42ae2a

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-09 Thread Raymond Hettinger
Raymond Hettinger added the comment: To my eyes, the patch looks somewhat tame. It is readable enough and doesn't do anything tricky. The set object implementation aimed to never recompute the hash when it was already known. It is reasonable that other set-like operations share that

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-09 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: So it needs to add 100 lines of C code to speed up a pretty uncommon operation for arguments of a particular type.

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-08 Thread Dennis Sweeney
Dennis Sweeney added the comment: A demo:
    >>> class Int(int):
    ...     hash_calls = 0
    ...     def __hash__(self):
    ...         Int.hash_calls += 1
    ...         return super().__hash__()
    ...
    >>> left = {Int(1): -1, Int(2): -2, Int(3): -3, Int(4): -4, Int(5): -5, Int(6): -6, Int(7): -7}
    >>>

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-08 Thread Dennis Sweeney
Dennis Sweeney added the comment: PR 20718 helps somewhat by only creating and hashing the tuples that wind up in the final set. Here's a benchmark:
    -m pyperf timeit -s "d1 = {i:i for i in range(100_000)}; d2 = {i:i|1 for i in range(100_000)}" "d1.items() ^ d2.items()"
    Master: 37.9
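The idea of building only the tuples that end up in the result can be sketched in pure Python. This is an illustration under stated assumptions, not the C code from the PR: the function name `items_xor_sketch` is hypothetical, and unlike C code that can reuse the hashes stored in the dict, this version still hashes keys during lookups.

```python
def items_xor_sketch(d1, d2):
    """Pure-Python approximation of d1.items() ^ d2.items().

    Hypothetical helper for illustration only. Tuples are created (and
    therefore hashed) only for items that belong in the result.
    """
    remaining = dict(d1)      # shallow copy; keys shared with d2 are removed below
    result = set()
    missing = object()        # sentinel distinguishing "absent" from a None value

    for key, val2 in d2.items():
        val1 = remaining.pop(key, missing)
        if val1 is missing:
            # Key only in d2: its item is part of the symmetric difference.
            result.add((key, val2))
        elif not (val1 is val2 or val1 == val2):
            # Same key, different values: both items differ, keep both.
            result.add((key, val1))
            result.add((key, val2))
        # Same key and equal value: the item is in both views, so no tuple is built.

    # Whatever is left in the copy had keys found only in d1.
    result.update(remaining.items())
    return result


# Same shape of data as the benchmark above, but smaller.
d1 = {i: i for i in range(100)}
d2 = {i: i | 1 for i in range(100)}
sketch_result = items_xor_sketch(d1, d2)
```

With this data, half of the keys map to equal values in both dicts, so the sketch never creates tuples for those entries.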

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-08 Thread Dennis Sweeney
Change by Dennis Sweeney: keywords: +patch; pull_requests: +19928; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/20718

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-07 Thread Raymond Hettinger
Raymond Hettinger added the comment:
> What about returning another dict_items instead of a set?
That API ship has already sailed. The published API guarantees that a set is returned; there is no changing that. The question is whether we care enough to provide an optimized implementation
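A short check of the behavior referred to here: the symmetric difference of two items views is a plain set of (key, value) tuples. The dictionaries are made-up sample data.

```python
d1 = {"a": 1, "b": 2}
d2 = {"a": 1, "b": 3}

# The ^ operator on two items views produces a new set, not another view.
diff = d1.items() ^ d2.items()

print(type(diff).__name__)   # set
print(sorted(diff))          # [('b', 2), ('b', 3)]
```

The shared item ('a', 1) is dropped, and both versions of the differing item appear in the result.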

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-07 Thread Dennis Sweeney
Dennis Sweeney added the comment: What about returning another dict_items instead of a set? As in (using the convention `d.items().mapping is d`):
    dict_items = type({}.items())
    def __xor__(self: dict_items, other):
        if isinstance(other, dict_items):
            new =

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-07 Thread Raymond Hettinger
Raymond Hettinger added the comment: It really depends on whether the key hashes are cheap or not.

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-07 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: But hashes of items are not known. Hashes of keys are known, hashes of values and items are not. We can add a special case for dict views in the set constructor and inline the hashing code for tuples, but it will be a lot of code for a pretty rare case. And
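The point about values can be observed with a small experiment. This is a sketch with a hypothetical `CountingValue` class: storing an object as a dict value does not hash it, but putting the (key, value) items into a set hashes each tuple, which in turn hashes the value.

```python
class CountingValue:
    """Hypothetical value type that counts calls to __hash__."""
    hash_calls = 0

    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        return isinstance(other, CountingValue) and self.n == other.n

    def __hash__(self):
        CountingValue.hash_calls += 1
        return hash(self.n)


d = {i: CountingValue(i) for i in range(4)}
calls_after_build = CountingValue.hash_calls   # dict values are never hashed on insertion

as_set = set(d.items())                        # hashing each tuple hashes its value
calls_after_set = CountingValue.hash_calls

print(calls_after_build, calls_after_set)
```

The first count stays at zero, while the second grows by at least one call per item, since the dict holds no stored hash for values or for item tuples.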

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-07 Thread Raymond Hettinger
Change by Raymond Hettinger: nosy: +Dennis Sweeney

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-06 Thread Raymond Hettinger
New submission from Raymond Hettinger : Running "d1.items() ^ d2.items()" will rehash every key and value in both dictionaries regardless of how much they overlap. By taking advantage of the known hashes, the analysis step could avoid making any calls to __hash__(). Only the result tuples
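One way to observe the rehashing described here is to count `__hash__` calls on the keys. This is a minimal sketch with a hypothetical `CountingKey` class; the count it reports depends on the interpreter version, so no expected number is given.

```python
class CountingKey(int):
    """Hypothetical int subclass that counts calls to __hash__."""
    hash_calls = 0

    def __hash__(self):
        CountingKey.hash_calls += 1
        return super().__hash__()


d1 = {CountingKey(i): i for i in range(5)}
d2 = {CountingKey(i): i for i in range(5)}   # same keys and values as d1

CountingKey.hash_calls = 0                   # ignore hashing done while building the dicts
result = d1.items() ^ d2.items()             # fully overlapping, so the result is empty
calls_during_xor = CountingKey.hash_calls

print(result, calls_during_xor)
```

Because the two dictionaries overlap completely, any `__hash__` calls counted during the operation are work that an implementation using the stored key hashes could avoid.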