New submission from David Su: In dictobject.c, the function dictviews_and performs a very expensive set creation operation on the keys of the self dict object:
>From Python 2.7: ... static PyObject* dictviews_and(PyObject* self, PyObject *other) { PyObject *result = PySet_New(self); ... tmp = PyObject_CallMethod(result, "intersection_update", "(O)", other); ... Similar logic can be found in the Python 3 code as well. For large dicts (~10 000 000 keys), it takes a significant amount of time to copy the keys into a new set, and also wastes a lot of memory. This issue is amplified when the intersection operation is performed many times. This implementation uses O(len(self)) time and space, while it is expected to use O(min(len(self), len(other)) time and space. Solution 1: Manually copy the keys of the dict into a set, and perform all intersection operations on that set. However, still wastes lots of memory. Solution 2 (Recommended): Port the intersection logic from the set class to the dict view class. ---------- components: Library (Lib) messages: 270836 nosy: David Su2 priority: normal severity: normal status: open title: dict viewkeys intersection slow for large dicts type: performance versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6 _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue27575> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com