Am 16.02.2012 19:24, schrieb Jim J. Jewett: > > > PEP author Mark Shannon wrote > (in > http://mail.python.org/pipermail/python-dev/attachments/20120208/05be469a/attachment.txt): > >> ... allows ... (the ``__dict__`` attribute of an object) to share >> keys with other attribute dictionaries of instances of the same class. > > Is "the same class" a deliberate restriction, or just a convenience > of implementation?
It's about the implementation: the class keeps a pointer to the key set. A subclass has a separate pointer for that. > I have often created subclasses (or even families > of subclasses) where instances (as opposed to the type) aren't likely > to have additional attributes. These would benefit from key-sharing > across classes, but I grant that it is a minority use case that isn't > worth optimizing if it complicates the implementation. In particular, the potential savings are small: the instances of the subclass will share the key sets per-class. So if you have S subclasses, you could save up to S keysets, whereas you are already saving N-S-1 keysets (assuming you have a total of N objects across all classes). > Have you timed not storing the hash (in the dict) at all, at least for > (unicode) str-only dicts? Going to the string for its own cached hash > breaks locality a bit more, but saves 1/3 of the memory for combined > tables, and may make a big difference for classes that have relatively > few instances. I'd be in favor of that, but it is actually an unrelated change: whether or not you share key sets is unrelated to whether or not str-only dicts drop the cached hash. Given a dict, it may be tricky to determine whether or not it is str-only, i.e. what layout to use. >> Reduction in memory use is directly related to the number of dictionaries >> with shared keys in existence at any time. These dictionaries are typically >> half the size of the current dictionary implementation. > > How do you measure that? The limit for huge N across huge numbers > of dicts should be 1/3 (because both hashes and keys are shared); I > assume that gets swamped by object overhead in typical small dicts. It's more difficult than that. He also drops the smalltable (which I think is a good idea), so accounting how this all plays together is tricky. >> If a table is split the values in the keys table are ignored, >> instead the values are held in a separate array. > > If they're just dead weight, then why not use them to hold indices > into the array, so that values arrays only have to be as long as > the number of keys, rather than rounding them up to a large-enough > power-of-two? (On average, this should save half the slots.) Good idea. However, how do you track per-dict how large the table is? Regards, Martin _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com