Raymond Hettinger added the comment:

> Isn't this already implemented?
No.

>>> class A: pass
>>> d = dict.fromkeys('abcdefghi')
>>> a = A()
>>> a.__dict__.update(d)
>>> b = A()
>>> b.__dict__.update(d)
>>> import sys
>>> [sys.getsizeof(m) for m in [d, vars(a), vars(b)]]
[368, 648, 648]
>>> c = A()
>>> c.__dict__.update(d)
>>> [sys.getsizeof(m) for m in [d, vars(a), vars(b), vars(c)]]
[368, 648, 648, 648]

There is no benefit reported for key-sharing.  Even if you make a
thousand of these instances, the reported size stays the same.

Here is the relevant code:

Py_ssize_t
_PyDict_SizeOf(PyDictObject *mp)
{
    Py_ssize_t size, usable, res;

    size = DK_SIZE(mp->ma_keys);
    usable = USABLE_FRACTION(size);

    res = _PyObject_SIZE(Py_TYPE(mp));
    if (mp->ma_values)
        res += usable * sizeof(PyObject*);
    /* If the dictionary is split, the keys portion is accounted-for
       in the type object. */
    if (mp->ma_keys->dk_refcnt == 1)
        res += (sizeof(PyDictKeysObject)
                - Py_MEMBER_SIZE(PyDictKeysObject, dk_indices)
                + DK_IXSIZE(mp->ma_keys) * size
                + sizeof(PyDictKeyEntry) * usable);
    return res;
}

It looks like the fixed overhead is included for every instance of a
split dictionary.  Instead, it might make sense to take the fixed
overhead and divide it by the number of instances sharing the keys
(averaging the overhead across the instances that share them):

    res = _PyObject_SIZE(Py_TYPE(mp)) / num_instances;

Perhaps use ceiling division:

    res = -(- _PyObject_SIZE(Py_TYPE(mp)) / num_instances);
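
For illustration, here is a minimal standalone sketch of that
amortization idea.  The helper name and the num_instances parameter are
hypothetical (roughly dk_refcnt - 1; nothing in dictobject.c passes such
a count today).  Note that C integer division truncates toward zero, so
-(-a / b) does not round up; the usual ceiling idiom in C is
(a + b - 1) / b:

#include <stdio.h>

/* Hypothetical helper: spread a fixed overhead across the instances
   that share one keys object. */
static long
amortized_overhead(long fixed_overhead, long num_instances)
{
    /* Ceiling division for positive operands. */
    return (fixed_overhead + num_instances - 1) / num_instances;
}

int
main(void)
{
    /* e.g. 280 bytes of shared overhead split across 3 instances */
    printf("%ld\n", amortized_overhead(280, 3));   /* prints 94 */
    return 0;
}

With something like this, summing sys.getsizeof() over all the instances
that share a keys table would count the shared portion roughly once
instead of once per instance.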