[issue13903] New shared-keys dictionary implementation
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset 10e8b97d0fd7 by Antoine Pitrou in branch 'default': Make the reference counting of dictkeys objects participate in refleak hunting http://hg.python.org/cpython/rev/10e8b97d0fd7 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Benjamin Peterson benja...@python.org: -- resolution: - fixed status: open - closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: Change insertdict to follow normal (non-stealing) ref-counting behaviour which fixes possible leakage. Patch attached. -- Added file: http://bugs.python.org/file25422/insertdict.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset c5fd332e5857 by Benjamin Peterson in branch 'default': change insertdict to not steal references (#13903) http://hg.python.org/cpython/rev/c5fd332e5857 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: Decref cached-keys when type is deallocated. Patch attached. -- Added file: http://bugs.python.org/file25381/cached_keys.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset a3beae842f13 by Benjamin Peterson in branch 'default': decref cached keys on type deallocation (#13903) http://hg.python.org/cpython/rev/a3beae842f13 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Antoine Pitrou pit...@free.fr added the comment: New changeset a3beae842f13 by Benjamin Peterson in branch 'default': decref cached keys on type deallocation (#13903) http://hg.python.org/cpython/rev/a3beae842f13 Is there any way to detect / avoid leaks on this separate refcounting scheme? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Antoine Pitrou pit...@free.fr added the comment: This patch integrates the dictkeys' refcounting into the refcount checking framework. Seems to work ok, but it would be better if someone more acquainted with the code could confirm it. -- Added file: http://bugs.python.org/file25385/dkdebug.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: Failing to maintain GC tracking in setdefault and copy (for split-tables) Patch attached -- Added file: http://bugs.python.org/file25339/gc_tracking.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset 507a6703d6a3 by Benjamin Peterson in branch 'default': fix dict gc tracking (#13903) http://hg.python.org/cpython/rev/507a6703d6a3 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: Failed to differentiate between failure and error in make_split_table(). Patch attached -- Added file: http://bugs.python.org/file25340/make_split_table_error.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Giampaolo Rodola' g.rod...@gmail.com: -- nosy: -giampaolo.rodola ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset b044e0568be2 by Martin v. Loewis in branch 'default': Account for shared keys in type's __sizeof__ (#13903). http://hg.python.org/cpython/rev/b044e0568be2 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset 5d5b72a71898 by Benjamin Peterson in branch 'default': distiguish between refusing to creating shared keys and error (#13903) http://hg.python.org/cpython/rev/5d5b72a71898 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Added file: http://bugs.python.org/file25313/73423916a242.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: I've updated the repository and uploaded a new patch in response to Benjamin's review. (And the contributor form is in the post). One remaining issue is the return value of __sizeof__(). If it is an int, then it cannot accurately reflect the memory use, but returning a float may seem rather surprising. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset 6e5855854a2e by Benjamin Peterson in branch 'default': Implement PEP 412: Key-sharing dictionaries (closes #13903) http://hg.python.org/cpython/rev/6e5855854a2e -- nosy: +python-dev resolution: - fixed stage: patch review - committed/rejected status: open - closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Benjamin Peterson benja...@python.org added the comment: Okay. I committed the latest patch. Subtleties like __sizeof__ can be worked out as people start using it. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Yury Selivanov yseliva...@gmail.com added the comment: Mark, did you add the test that your patch initially was failing with? http://mail.python.org/pipermail/python-dev/2012-February/116605.html -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: I fixed it back then, but didn't add the test. It subsequently regressed. Should know better. Patch (with test this time) attached. -- resolution: fixed - status: closed - open Added file: http://bugs.python.org/file25318/str_subclass.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset 34b6998efd2c by Benjamin Peterson in branch 'default': fix instance dicts with str subclasses (#13903) http://hg.python.org/cpython/rev/34b6998efd2c -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Martin v. Löwis mar...@v.loewis.de added the comment: Mark, can you please submit a contributor form? -- nosy: +loewis ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Yury Selivanov yseliva...@gmail.com: -- nosy: +Yury.Selivanov ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Raymond Hettinger raymond.hettin...@gmail.com added the comment: Moving the assigned to to nobody. I won't have a chance for a thorough review for another ten days. Hopefully, someone else will have a chance to review it before then. -- assignee: rhettinger - ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: I don't really understand your objection to changing the method-cache size. As I said, I can remove the change, but that will cause the performance regression that Antoine noticed. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: How about changing the method-cache size in a separate patch? On my machine, reducing the method-cache size from 2**10 to 2**9 results in a very small improvement in performance (with the original dict). That way we don't get a performance regression with the new dict and the patch is better focussed. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Antoine Pitrou pit...@free.fr added the comment: I'm not bothered by the regression in silent_logging, as it is a micro benchmark with a very short running time. I'm not concerned about the micro-benchmark itself but the fact that it might hint at a wider problem. Also, I don't get your remark about it running in a short time. Your patch AFAICT doesn't need any warm up period to exhibit any improvements. Reducing the method-cache size from 2**10 to 2**9 allows the working set to fit better in the cache. This fixes the regression in mako, but makes OO programs that use few objects (such as richards) a bit slower. I don't think we should reduce the size of the method cache. 1024 is not a very large number, and might even be too small for complex OO programs. I also think that, apart from the dict storage changes, your patch should strive not to change any other tunables. Otherwise we're really comparing apples to oranges. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: I'm not bothered by the regression in silent_logging, as it is a micro benchmark with a very short running time. I'm not concerned about the micro-benchmark itself but the fact that it might hint at a wider problem. Or it might not. Micro-benchmarks produce micro-optimisations. That's why I dislike them. Also, I don't get your remark about it running in a short time. Your patch AFAICT doesn't need any warm up period to exhibit any improvements. What I mean is that the runtime is so short, no one would notice any change, so who cares? Reducing the method-cache size from 2**10 to 2**9 allows the working set to fit better in the cache. This fixes the regression in mako, but makes OO programs that use few objects (such as richards) a bit slower. I don't think we should reduce the size of the method cache. 1024 is not a very large number, and might even be too small for complex OO programs. not very large or too small, by what metric? The optimum size (for speed) of the method cache depends on the size of hardware data cache, the complexity of the program, and many other factors. Attempt to reason about it are pretty much futile. Empiricism is the only way to go. I also think that, apart from the dict storage changes, your patch should strive not to change any other tunables. Otherwise we're really comparing apples to oranges. If the implementation changes, shouldn't the tunable parameters be retuned? Cheers, Mark. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Antoine Pitrou pit...@free.fr added the comment: Also, I don't get your remark about it running in a short time. Your patch AFAICT doesn't need any warm up period to exhibit any improvements. What I mean is that the runtime is so short, no one would notice any change, so who cares? None of the benchmarks used here are real-world workloads, so you might as well claim that they are all irrelevant. But then we'll have a hard time assessing the consequences of your patch. I don't think we should reduce the size of the method cache. 1024 is not a very large number, and might even be too small for complex OO programs. not very large or too small, by what metric? By the metric of the number of classes and methods in a complex OO application (for example something based on Twisted or SQLAlchemy). I also think that, apart from the dict storage changes, your patch should strive not to change any other tunables. Otherwise we're really comparing apples to oranges. If the implementation changes, shouldn't the tunable parameters be retuned? Only if there's a reasoning behind it. Perhaps the retuning would have given the same results without the rest of your patch. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Raymond Hettinger raymond.hettin...@gmail.com added the comment: [Antoine] I also think that, apart from the dict storage changes, your patch should strive not to change any other tunables. I agree. Please keep the patch focused on the single task, the shared keys. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Added file: http://bugs.python.org/file25096/372d0bca85ae.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: I'm not bothered by the regression in silent_logging, as it is a micro benchmark with a very short running time. The regression in mako is, I think, caused by competition for the data cache between the new dict implementation and the method-cache used by _PyType_Lookup. Although the new-dict uses less memory overall, the size of a single split-table dict (keys + value) is larger than a combined table. Reducing the method-cache size from 2**10 to 2**9 allows the working set to fit better in the cache. This fixes the regression in mako, but makes OO programs that use few objects (such as richards) a bit slower. Compared with tip, the new-dict implementation is 4% slower, using 7% less memory for mako. 6% slower, using 5% less memory for richards. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Antoine Pitrou pit...@free.fr added the comment: The latest patch has a significant (negative) effect on some benchmarks: ### silent_logging ### Min: 0.057927 - 0.068228: 1.18x slower Avg: 0.058218 - 0.068660: 1.18x slower Significant (t=-36.06) ### mako ### Min: 0.118240 - 0.140451: 1.19x slower Avg: 0.120145 - 0.142422: 1.19x slower Significant (t=-61.66) These regressions should be fixed before going any further, IMO. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Added file: http://bugs.python.org/file24766/257e16e71654.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Jim Jewett jimjjew...@gmail.com added the comment: As of Feb 28, 2012, the PEP mentions an additional optimization of storing the values in an array indexed by (effectively) key insertion order, rather than key position. (Alternative Implementation) It states that this would reduce memory usage for the values array by 1/3. 1/3 is a worst-case measurement; average is 1/2. (At savings of less than 1/3, the keys would resize, to initial savings of 2/3. And yes, that means in practice, the average savings would be greater than half, because the frequency of dicts of size N decreases with N.) I presume you mean to allocate a values array of size == actual keys rather than size == usable keys. Actually, I meant size==maximum(keys in use for this dict), so that for objects with a potentially complex lifecycle, those instances that had not yet needed the new attributes would not need to allocate space for them. But I see now that just allocating space for each of the potential keys is indeed simpler. And I suppose that a class which won't eventually need all the attributes for every instance is an unusual case. So to get beneath 2/3 without lots of reallocation probably requires knowing when the key set is likely to be complete, and that is indeed more complex than the current changes. (That said, you have left code in for immutable keys, so there may be cases where this isn't so hard after all.) Making the values array smaller than the the number of usable keys causes a number of issues: 1. The values_size will need to be kept in the dict, not in the keys, That was indeed true for my proposal. If you just allocate 2/3, then you don't need to store the value, unless you want to be lazy about reallocating when the keys grow. Even then, there are few enough potential sizes to fit the information in a byte, or we could get the info for free *if* the dict were already timestamped or versioned. It states that the keys table would need an additional values_size field, but in the absence of dummies, this is just ma_used. I was mixing two structures in my mind. The current key count (which is sufficient for a new values array) is actually USABLE_FRACTION(dk_size) - dk_free. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Jim Jewett jimjjew...@gmail.com added the comment: On Fri, Mar 9, 2012 at 12:13 PM, Jim Jewett So to get beneath 2/3 without lots of reallocation probably requires knowing when the key set is likely to be complete, and that is indeed more complex than the current changes. (That said, you have left code in for immutable keys, so there may be cases where this isn't so hard after all.) On second thought, avoiding reallocation doesn't have to be perfect -- just good enough to work on average. For a *normal* class, the keyset won't change after the first instance has completed its __init__. Which of course again leads to autoslots and a normally NULL extra_dict. And having done that, it makes sense to combine the (normal) instance dict with the type dict to simplify the lookup chain, but ... that is probably too aggressive for the 3.3 schedule. One silver lining to your patch plus hash randomization is that that 3.4 should have a pretty free hand with regards to internal details. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Jim Jewett jimjjew...@gmail.com added the comment: As of Feb 28, 2012, the PEP mentions an additional optimization of storing the values in an array indexed by (effectively) key insertion order, rather than key position. (Alternative Implementation) It states that this would reduce memory usage for the values array by 1/3. 1/3 is a worst-case measurement; average is 1/2. (At savings of less than 1/3, the keys would resize, to initial savings of 2/3. And yes, that means in practice, the average savings would be greater than half, because the frequency of dicts of size N decreases with N.) It states that the keys table would need an additional values_size field, but in the absence of dummies, this is just ma_used. Note that this would also simplify resizing, as the values arrays would not have to be re-ordered, and would not have to be modified at all unless/until that particular instance received a value for a position beyond its current size. -- nosy: +Jim.Jewett ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: Jim Jewett wrote: Jim Jewett jimjjew...@gmail.com added the comment: As of Feb 28, 2012, the PEP mentions an additional optimization of storing the values in an array indexed by (effectively) key insertion order, rather than key position. (Alternative Implementation) It states that this would reduce memory usage for the values array by 1/3. 1/3 is a worst-case measurement; average is 1/2. (At savings of less than 1/3, the keys would resize, to initial savings of 2/3. And yes, that means in practice, the average savings would be greater than half, because the frequency of dicts of size N decreases with N.) I presume you mean to allocate a values array of size == actual keys rather than size == usable keys. Making the values array smaller than the the number of usable keys causes a number of issues: 1. The values_size will need to be kept in the dict, not in the keys, which would make the dict larger for size 3 or 5, the same size for size 2 or 4, but it would save memory for sizes 6-8 (see below). So it might not save memory at all, it depends on the distribution of the sizes of shared-key dicts. 2. dk_usable would need to be reverted to dk_fill (c.f ma_fill) in order to quickly allocate values arrays. This would slow down the resizing check, which is now done before insertion, so might be slower. (not really a problem, but it might slow down inserts) 3. An extra check needs to be performed for each read to ensure the index is in-bounds (again not really a problem, but it might slow down reads) Comparative sizes of values array (including size field): Items ProposedAlternativeOther AlternativeDelta (size == usuable) (size == keys (+1)) 1 4 3 2 -1 2 4 3 3 0 3 4 3 4 +1 4 8 5 5 0 5 8 5 6 +1 6 16 10 7 -3 7 16 10 8 -2 8 16 10 9 -1 9 16 1010 0 10 16 1011 +1 12 32 2113 -8 14 32 2115 -6 The memory savings of the two alternative implementations are broadly similar. Clearly, either of the alternatives are attractive in terms of memory use. I think it is definitely worth investigating further, but I would like to get the proposed implementation into CPython first. It states that the keys table would need an additional values_size field, but in the absence of dummies, this is just ma_used. values_size != ma_used values_size is the size of the values array, not the number of values. Don't forget deletions or the case where items are inserted in a different order from that of another dict sharing the same keys. Although there are no dummies, (key != NULL, value == NULL) is a legal pair representing a missing or deleted value. Cheers, Mark. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Added file: http://bugs.python.org/file24670/49b7e7e4a27c.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Added file: http://bugs.python.org/file24510/691ce331f955.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Removed file: http://bugs.python.org/file24465/e50db1b7ad7b.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Added file: http://bugs.python.org/file24465/e50db1b7ad7b.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Removed file: http://bugs.python.org/file24456/20702d1acf17.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: -- hgrepos: +112 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Added file: http://bugs.python.org/file24453/a9138aba7896.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Removed file: http://bugs.python.org/file24360/6a21f3b35e20.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Removed file: http://bugs.python.org/file24453/a9138aba7896.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Added file: http://bugs.python.org/file24454/1f703b2607af.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Antoine Pitrou pit...@free.fr added the comment: Looking at your latest patch, I worry about any deletion +(including pop popitem) causes a split table to become a combined table. Why wouldn't you use a dummy pointer (such as ((PyObject *) 1)) to signal deleted slots? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: Looking at your latest patch, I worry about any deletion +(including pop popitem) causes a split table to become a combined table. Why wouldn't you use a dummy pointer (such as ((PyObject *) 1)) to signal deleted slots? In fact here is no need for a dummy pointer. When deleting from a split-table, the value can just be set to NULL, the resulting key-NULL pair is legal in a split-table. Your suggestion doesn't make the code any more complex, so I've included it. In practice, it will very rare that a deletion occurs in a split table (since they are only used for attribute dictionaries). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Removed file: http://bugs.python.org/file24454/1f703b2607af.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Added file: http://bugs.python.org/file24455/bc286099ce9a.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Removed file: http://bugs.python.org/file24455/bc286099ce9a.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Added file: http://bugs.python.org/file24456/20702d1acf17.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: -- hgrepos: -109 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: Raymond Hettinger wrote: Raymond Hettinger raymond.hettin...@gmail.com added the comment: Changing dictionaries is a big deal. You're changing many pieces at once (not a good idea) including changing tunable parameters that are well-studied (I spent a month testing whether 5/8 was better idea that 2/3 for resizing or when the ideal small dict size was 4, 8, or 16). You're changing the meaning of the fields in dictobject.h which will likely break any code that relied on those. The ideas may be good ones but they warrant a good deal of thought. Dicts weren't just slapped together -- the current code is the product to two decades of tweaking by engineers who devoted significant time to the task. It would be easy to unknowingly undo some of their work. OK. I'll write a PEP. By the way, I'm trying not changing the tunable parameters for the unshared-keys case, just the shared-keys case. Of course, they do interact with each other. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Antoine Pitrou pit...@free.fr added the comment: As Victor I think the tunables could be changed separately, unless they truely get in the way of the shared-keys optimization. By the way, there will need a protection for the case where instances are used as bags of key/value pairs (dict-alikes with attribute access). Perhaps bound the size of the keys array to 100 entries and then switch into unshared mode. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Jesús Cea Avión j...@jcea.es: -- nosy: +jcea ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Gregory P. Smith g...@krypto.org added the comment: FYI - I strongly support this type of work to reduce memory use of the Python interpreter! :) Also, yes, constant changing should be separate from this change but are worth occasionally re-measuring and justifying as common computer architectures have changed since the original decisions were made. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Raymond Hettinger raymond.hettin...@gmail.com added the comment: This would likely require a PEP before having a chance of being considered for inclusion. -- nosy: +rhettinger ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Mark Shannon m...@hotpy.org added the comment: Does this really need a PEP? There is no new language feature and no change to any API. It is just saving some memory. The only possible issue is changing dict repr() ordering, but issue 13703 will do that anyway. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- assignee: - rhettinger ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Raymond Hettinger raymond.hettin...@gmail.com added the comment: Changing dictionaries is a big deal. You're changing many pieces at once (not a good idea) including changing tunable parameters that are well-studied (I spent a month testing whether 5/8 was better idea that 2/3 for resizing or when the ideal small dict size was 4, 8, or 16). You're changing the meaning of the fields in dictobject.h which will likely break any code that relied on those. The ideas may be good ones but they warrant a good deal of thought. Dicts weren't just slapped together -- the current code is the product to two decades of tweaking by engineers who devoted significant time to the task. It would be easy to unknowingly undo some of their work. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
New submission from Mark Shannon m...@hotpy.org: The proposed dictionary implementation allows sharing of keys hashes between dictionaries. This leads to substantial memory savings for object-oriented programs. For non-OO programs the impact is negligible. -- components: Interpreter Core hgrepos: 109 messages: 152229 nosy: Mark.Shannon priority: normal severity: normal status: open title: New shared-keys dictionary implementation type: performance versions: Python 3.4 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Georg Brandl ge...@python.org: -- keywords: +patch Added file: http://bugs.python.org/file24357/061f8573af54.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Mark Shannon m...@hotpy.org: Added file: http://bugs.python.org/file24360/6a21f3b35e20.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Georg Brandl ge...@python.org: Removed file: http://bugs.python.org/file24357/061f8573af54.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Antoine Pitrou pit...@free.fr: -- nosy: +benjamin.peterson, pitrou stage: - patch review versions: +Python 3.3 -Python 3.4 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Terry J. Reedy tjre...@udel.edu added the comment: In the initial comment, 'Dummy' to 'Deleted' here but only here: - Holds an active (key, value) pair. Active can transition to Dummy + Holds an active (key, value) pair. Active can transition to Deleted Im Lib/test/test_pprint.py def test_set_reprs(self): ... # Consequently, this test is fragile and ... +# XXX So why include this test in the first place? Raymond, I believe you added this 44927 and revised for 3.x in 45067. I imagine it will also be a problem with randomized hashes. Should it be removed or somehow revised? -- nosy: +terry.reedy ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Giampaolo Rodola' g.rod...@gmail.com: -- nosy: +giampaolo.rodola ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Philip Jenvey pjen...@underboss.org: -- nosy: +pjenvey ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by Gregory P. Smith g...@krypto.org: -- nosy: +gregory.p.smith ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
STINNER Victor victor.stin...@haypocalc.com added the comment: I see two unrelated parts in your patch: - change dictionary structure in memory - change many constants linked to optimization: PyDICT_MAXFREELIST: 80-40, 2/3-5/8, etc. You may open a new issue for the second part, except if I am wrong and you need to change constants for the first part? (I don't understand why you changed constants and how you chose new values.) -- nosy: +haypo ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13903] New shared-keys dictionary implementation
Changes by John O'Connor tehj...@gmail.com: -- nosy: +jcon ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13903 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com