New submission from David Su:

In dictobject.c, the function dictviews_and performs a very expensive set 
creation operation on the keys of the self dict object:

>From Python 2.7:
...
static PyObject*
dictviews_and(PyObject* self, PyObject *other)
{
    PyObject *result = PySet_New(self);
...
    tmp = PyObject_CallMethod(result, "intersection_update", "(O)", other);
...

Similar logic can be found in the Python 3 code as well.


For large dicts (~10 000 000 keys), it takes a significant amount of time to 
copy the keys into a new set, and also wastes a lot of memory. This issue is 
amplified when the intersection operation is performed many times.

This implementation uses O(len(self)) time and space, while it is expected to 
use O(min(len(self), len(other)) time and space.


Solution 1: Manually copy the keys of the dict into a set, and perform all 
intersection operations on that set. However, still wastes lots of memory. 

Solution 2 (Recommended): Port the intersection logic from the set class to the 
dict view class.

----------
components: Library (Lib)
messages: 270836
nosy: David Su2
priority: normal
severity: normal
status: open
title: dict viewkeys intersection slow for large dicts
type: performance
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27575>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to