New submission from Raymond Hettinger:
Propose adding two functions, PyDict_GetItem_KnownHash() and
PyDict_SetItem_KnownHash().
It is reasonably common to make two successive dictionary accesses with the
same key. This results in calling the hash function twice to compute the same
result.
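
The duplicated work can be observed from pure Python. The following is a minimal sketch, not part of the attached patch; CountingKey is a hypothetical helper class introduced only to count calls to __hash__ during a Counter-style update on a plain dict.

```python
class CountingKey:
    """Hypothetical key type that records how often it is hashed."""

    def __init__(self, value):
        self.value = value
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.value == other.value


counts = {}
key = CountingKey(("a", 1))

# One lookup plus one store with the same key: the dict hashes the key
# once for get() and once more for the item assignment.
counts[key] = counts.get(key, 0) + 1
calls_after_update = key.hash_calls
```

Under CPython, calls_after_update should be 2 here: one hash computation for the lookup and a second, redundant one for the store.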
For example, the technique can be used to speed up collections.Counter (see the
attached patch to show how). In that patch, the hash is computed once, then
used twice (to retrieve the prior count and to store the new count).
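
The compute-once, use-twice pattern can be modelled in Python as a rough sketch. KnownHashTable, get_known_hash(), set_known_hash(), TrackedKey and count_one() are all hypothetical names for illustration; this toy chained table is not CPython's dict implementation and not the attached C patch.

```python
class KnownHashTable:
    """Toy chained hash table whose accessors take a precomputed hash."""

    def __init__(self, nbuckets=8):
        self._buckets = [[] for _ in range(nbuckets)]

    def get_known_hash(self, key, h, default=None):
        # The caller supplies h, so hash(key) is never called here.
        for k, v in self._buckets[h % len(self._buckets)]:
            if k is key or k == key:
                return v
        return default

    def set_known_hash(self, key, h, value):
        # Same bucket selection as the lookup, again without rehashing.
        bucket = self._buckets[h % len(self._buckets)]
        for i, (k, _) in enumerate(bucket):
            if k is key or k == key:
                bucket[i] = (k, value)
                return
        bucket.append((key, value))


class TrackedKey:
    """Hypothetical key type that counts calls to __hash__."""

    hash_calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        TrackedKey.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, TrackedKey) and self.name == other.name


def count_one(table, key):
    h = hash(key)                           # hash computed once
    old = table.get_known_hash(key, h, 0)   # first use: read the prior count
    table.set_known_hash(key, h, old + 1)   # second use: store the new count


table = KnownHashTable()
spam = TrackedKey("spam")
count_one(table, spam)
count_one(table, spam)
calls_for_two_updates = TrackedKey.hash_calls
```

With this arrangement each update costs one hash computation instead of two, so two updates of the same key leave calls_for_two_updates at 2 rather than 4.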
There are many other places in the standard library that could benefit:
Modules/posixmodule.c 1254
Modules/pyexpat.c 343 and 1788 and 1798
Modules/_json.c 628 and 1446 and 1515 and 1697
Modules/selectmodule.c 465
Modules/zipmodule.c 137
Objects/typeobject.c 6678 and 6685
Objects/unicodeobject.c 14997
Python/_warnings.c 195
Python/compile.c 1134
Python/import.c 1046 and 1066
Python/symtable 671 and 687 and 1068
A similar technique has been used for years in the Objects/setobject.c
internals as a way to eliminate unnecessary calls to PyObject_Hash() during
set-to-set and set-to-dict operations.
The benefit is biggest for objects such as tuples or user-defined classes that
have to recompute the hash on every call to PyObject_Hash().
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue21101>
_______________________________________