[issue30040] new empty dict can be more small

2019-03-15 Thread Inada Naoki
Inada Naoki added the comment:
> I do not like how much code is needed for such minor optimization.
PR 12307 is +46 lines. The benefit is about 10 ns on the first insertion.
$ cpython/python.opt -m perf timeit --compare-to master/python.opt --python-names master:empty-dict2 --duplicate 100

[issue30040] new empty dict can be more small

2019-03-14 Thread Inada Naoki
Inada Naoki added the comment: New changeset 3fe7fa316f74ed630fbbcdf54564f15cda7cb045 by Inada Naoki in branch 'master': bpo-30040: update news entry (GH-12324) https://github.com/python/cpython/commit/3fe7fa316f74ed630fbbcdf54564f15cda7cb045

[issue30040] new empty dict can be more small

2019-03-14 Thread Inada Naoki
Change by Inada Naoki: -- pull_requests: +12297

[issue30040] new empty dict can be more small

2019-03-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I do not like how much code is needed for such minor optimization.

[issue30040] new empty dict can be more small

2019-03-14 Thread Inada Naoki
Inada Naoki added the comment: Benchmark results are here: https://github.com/methane/sandbox/tree/master/2019.01/empty-dict
Notes:
* The overhead introduced by PR 1080 (https://bugs.python.org/issue30040#msg337778) is cancelled by the first-insert optimization. It is now faster than before.
* PR

[issue30040] new empty dict can be more small

2019-03-13 Thread Inada Naoki
Inada Naoki added the comment: I created two PRs. PR 12307 just optimizes inserting into an empty dict. PR 12308 has the same optimization and removes the shared empty key (ma_keys = NULL). I confirmed PR 12307 is faster than before for `d = {}; d["a"]=None`. I'll benchmark them later.

[issue30040] new empty dict can be more small

2019-03-13 Thread Inada Naoki
Change by Inada Naoki: -- pull_requests: +12284

[issue30040] new empty dict can be more small

2019-03-13 Thread Inada Naoki
Change by Inada Naoki: -- keywords: +patch pull_requests: +12283 stage: resolved -> patch review

[issue30040] new empty dict can be more small

2019-03-12 Thread Inada Naoki
Inada Naoki added the comment: On Wed, Mar 13, 2019 at 4:46 AM Raymond Hettinger wrote:
> Raymond Hettinger added the comment:
>
> I hate to see you working so hard to make this work. IMO, the effort is futile. Dicts were intentionally designed to do a single memory allocation and

[issue30040] new empty dict can be more small

2019-03-12 Thread Terry J. Reedy
Terry J. Reedy added the comment: Some of my thoughts on this. Conceptually, I expected that clearing a normal dict should make it an empty normal dict. I presume that making it instead an empty shared key dict is a matter of efficiency and code simplicity. If the 'anomaly' is to be

[issue30040] new empty dict can be more small

2019-03-12 Thread Mark Shannon
Mark Shannon added the comment: Serhiy, for {'a': None} the dict is created using _PyDict_NewPresized() so this makes no difference.

[issue30040] new empty dict can be more small

2019-03-12 Thread Mark Shannon
Mark Shannon added the comment: In general, I agree with Raymond that this is likely to be counter-productive. But let's not guess, let's measure :) I expect that there are few live empty dicts at any time for most programs. In which case there is no point in any change that attempts to save

[issue30040] new empty dict can be more small

2019-03-12 Thread Raymond Hettinger
Raymond Hettinger added the comment:
> Hmm, while 2x faster temporal empty dict is nice,
> 10% slower case can be mitigated.
> I will try more micro benchmarks and optimizations.
I hate to see you working so hard to make this work. IMO, the effort is futile. Dicts were intentionally

[issue30040] new empty dict can be more small

2019-03-12 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: What about creating small dict using a dict display? E.g. {'a': None}.

[issue30040] new empty dict can be more small

2019-03-12 Thread STINNER Victor
Change by STINNER Victor: -- nosy: +vstinner

[issue30040] new empty dict can be more small

2019-03-12 Thread Inada Naoki
Inada Naoki added the comment: Another micro benchmark:
$ ./py.edict.opt -m perf timeit --compare-to ./py.master.opt '{}' --duplicate=10
py.master.opt: . 26.3 ns +- 0.5 ns
py.edict.opt: . 13.0 ns +- 0.1 ns
Mean +- std dev: [py.master.opt] 26.3 ns +-

[issue30040] new empty dict can be more small

2019-03-12 Thread Inada Naoki
Change by Inada Naoki: -- nosy: +Mark.Shannon

[issue30040] new empty dict can be more small

2019-03-12 Thread Inada Naoki
Inada Naoki added the comment:
> I don't think this should have been done. Conceptually, there is no basis
> for presuming key-sharing for new empty dicts -- you can't know what they
> would share with. This patch essentially undoes the entire reason for having a pre-allocated minsize

[issue30040] new empty dict can be more small

2019-03-12 Thread Raymond Hettinger
Change by Raymond Hettinger: -- nosy: +Mark.Shannon

[issue30040] new empty dict can be more small

2019-03-12 Thread Raymond Hettinger
Raymond Hettinger added the comment: I don't think this should have been done. Conceptually, there is no basis for presuming key-sharing for new empty dicts -- you can't know what they would share with. This patch essentially undoes the entire reason for having a pre-allocated minsize

[issue30040] new empty dict can be more small

2019-03-12 Thread Inada Naoki
Change by Inada Naoki: -- resolution: -> fixed stage: patch review -> resolved status: open -> closed versions: +Python 3.8 -Python 3.7

[issue30040] new empty dict can be more small

2019-03-12 Thread Inada Naoki
Inada Naoki added the comment: New changeset f2a186712bfe726d54723eba37d80c7f0303a50b by Inada Naoki in branch 'master': bpo-30040: new empty dict uses key-sharing dict (GH-1080) https://github.com/python/cpython/commit/f2a186712bfe726d54723eba37d80c7f0303a50b

[issue30040] new empty dict can be more small

2017-04-17 Thread Josh Rosenberg
Josh Rosenberg added the comment: For the record, a legitimate case where many empty dicts are created, and few are populated, is the collections-free approach to defaultdict(dict):
mydict.setdefault(key1, {})[key2] = val
For, say, 100 unique key1s, and 10,000 total key1/key2 pairs, you'd
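A small sketch of why this pattern churns through empty dicts: the default argument to `setdefault` is evaluated on every call, whether or not `key1` is already present. The `new_dict` helper below is hypothetical; it stands in for the `{}` literal only so that creations can be counted, and the key layout (100 unique `key1` values, 10,000 pairs) simply mirrors the numbers mentioned in the message.

```python
created = 0

def new_dict():
    """Hypothetical stand-in for the `{}` literal that counts dict creations."""
    global created
    created += 1
    return {}

mydict = {}
for i in range(10_000):
    # 100 unique key1 values, 10,000 distinct (key1, key2) pairs in total.
    key1, key2 = i % 100, i // 100
    # The default argument is built on every call, even when key1 already exists.
    mydict.setdefault(key1, new_dict())[key2] = i

print("empty dicts created:", created)
print("dicts actually kept:", len(mydict))
```

Only the dicts stored on the first sighting of each `key1` survive; the rest are discarded immediately after being created.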

[issue30040] new empty dict can be more small

2017-04-17 Thread Louie Lu
Louie Lu added the comment: I'm testing[1] whether we can make a fast path that detects if keys is `empty_keys_struct` inside `dictresize`. It can be faster than the original patch, but is still slower than default (unpatched) in most cases. ➜ cpython git:(compact_empty_dict) ✗ ./python.default -m perf

[issue30040] new empty dict can be more small

2017-04-16 Thread Louie Lu
Louie Lu added the comment: Forgive my words, I traced the wrong code. Sorry about that.

[issue30040] new empty dict can be more small

2017-04-16 Thread Louie Lu
Louie Lu added the comment: Inada's patched version acts differently inside `PyObject_SetItem`. When running this code:
'x = {}; x['a'] = 123'
at PyObject_SetItem, the patched version goes to this line:
>│179 if (m && m->mp_ass_subscript)
 │180 return m->mp_ass_subscript(o, key,

[issue30040] new empty dict can be more small

2017-04-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I got the following result:
$ ./python.patched -m perf timeit --compare-to=./python.default --duplicate=100 -- '{}'
python.default: . 203 ns +- 5 ns
python.patched: . 97.1 ns +- 0.7 ns
Mean +- std dev: [python.default]

[issue30040] new empty dict can be more small

2017-04-11 Thread INADA Naoki
INADA Naoki added the comment:
> I mean creating a solo empty dict doesn't seem to make much sense. Although
> it saves memory, but when it's populated, it's resized and the memory
> occupation comes back.
But sometimes it's not populated.
class A:
    def __init__(self, **kwargs):

[issue30040] new empty dict can be more small

2017-04-11 Thread R. David Murray
R. David Murray added the comment: Sorry, but I no longer have access to that application (I'm a consultant, and the owner is no longer a client).

[issue30040] new empty dict can be more small

2017-04-11 Thread INADA Naoki
INADA Naoki added the comment: Thank you for your reply. Would you try to check how the patch [1] affects memory usage of your application? I think the patch can be applied to 3.6 easily. [1] https://patch-diff.githubusercontent.com/raw/python/cpython/pull/1080.patch

[issue30040] new empty dict can be more small

2017-04-11 Thread R. David Murray
R. David Murray added the comment: I've worked on an application (proprietary, unfortunately) that created a lot of empty dictionaries that only sometimes got populated. It involved sqlalchemy, but I don't remember if the dicts came from sqlalchemy itself or from the code that used it. That

[issue30040] new empty dict can be more small

2017-04-11 Thread INADA Naoki
INADA Naoki added the comment: While I think it's preferable that {} and d.clear() have the same memory footprint, I need a real-world example in which empty dicts affect overall memory usage. I'll check the memory usage difference with the application I used in this ML thread.

[issue30040] new empty dict can be more small

2017-04-11 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Use "--duplicate 100" when making microbenchmarks for such fast operations. The overhead of iterating can be significant and comparable with the time of the operation. -- nosy: +serhiy.storchaka stage: -> patch review
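The effect of statement duplication can be sketched with the standard-library `timeit` module. Note this is a swapped-in tool, not the `perf` module used in this thread: the helper `per_op_ns` and its parameters are assumptions made for illustration, and it only approximates what `--duplicate` does by repeating the statement inside the timed loop body so that loop overhead is spread over many operations.

```python
import timeit

def per_op_ns(stmt, duplicate, loops):
    """Time `stmt` repeated `duplicate` times per loop iteration.

    Returns the average time per single statement, in nanoseconds.
    """
    body = "\n".join([stmt] * duplicate)
    total_seconds = timeit.timeit(body, number=loops)
    return total_seconds / (loops * duplicate) * 1e9

# Same total number of `{}` evaluations in both runs (100,000).
plain = per_op_ns("{}", duplicate=1, loops=100_000)
duplicated = per_op_ns("{}", duplicate=100, loops=1_000)

print(f"no duplication:   {plain:.1f} ns per op")
print(f"100x duplication: {duplicated:.1f} ns per op")
```

The absolute numbers depend on the machine and interpreter; the point is only that the per-operation figure without duplication includes a share of the loop's own cost.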

[issue30040] new empty dict can be more small

2017-04-11 Thread Xiang Zhang
Xiang Zhang added the comment: I mean creating a solo empty dict doesn't seem to make much sense. Although it saves memory, but when it's populated, it's resized and the memory occupation comes back. And this makes PyDict_New() hard to understand. :-(

[issue30040] new empty dict can be more small

2017-04-11 Thread INADA Naoki
INADA Naoki added the comment: macro bench result:
$ ./python.default -m perf compare_to -G --min-speed=1 default.json patched.json
Slower (11):
- scimark_lu: 362 ms +- 13 ms -> 383 ms +- 22 ms: 1.06x slower (+6%)
- unpickle_pure_python: 882 us +- 18 us -> 924 us +- 18 us: 1.05x slower (+5%)
-

[issue30040] new empty dict can be more small

2017-04-11 Thread INADA Naoki
INADA Naoki added the comment:
> Isn't the latter case the more common one? Creating an empty dict and then
> populate it.
This is a memory usage optimization, not a performance optimization. (But I think memory efficiency makes multi-process applications faster because L3 cache size is limited

[issue30040] new empty dict can be more small

2017-04-11 Thread Xiang Zhang
Xiang Zhang added the comment: Isn't the latter case the more common one? Creating an empty dict and then populate it. -- nosy: +xiang.zhang

[issue30040] new empty dict can be more small

2017-04-11 Thread INADA Naoki
Changes by INADA Naoki: -- components: +Interpreter Core type: -> resource usage versions: +Python 3.7

[issue30040] new empty dict can be more small

2017-04-11 Thread INADA Naoki
INADA Naoki added the comment: performance impact

best case:
$ ./python.patched -m perf timeit --compare-to=`pwd`/python.default -- '{}'
python.default: . 36.9 ns +- 0.9 ns
python.patched: . 25.3 ns +- 0.7 ns
Mean +- std dev: [python.default] 36.9 ns +-

[issue30040] new empty dict can be more small

2017-04-11 Thread INADA Naoki
Changes by INADA Naoki: -- pull_requests: +1223

[issue30040] new empty dict can be more small

2017-04-11 Thread INADA Naoki
New submission from INADA Naoki: dict.clear() makes the dict an empty key-sharing dict to reduce its size. A new dict can use the same technique.
$ ./python.default
Python 3.7.0a0 (heads/master:6dfcc81, Apr 10 2017, 19:55:52)
[GCC 6.2.0 20161005] on linux
Type "help", "copyright", "credits" or
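The footprint difference this report describes can be inspected with `sys.getsizeof`. This is a minimal sketch, not the session from the original report: the byte counts depend on the CPython version and platform, and on interpreters that include the change from this issue the two sizes may be equal, so no specific numbers are assumed. The helper name `dict_sizes` is made up for illustration.

```python
import sys

def dict_sizes():
    """Return (size of a fresh {}, size of a populated-then-cleared dict) in bytes."""
    fresh = {}
    cleared = {"a": None}
    cleared.clear()
    return sys.getsizeof(fresh), sys.getsizeof(cleared)

fresh_size, cleared_size = dict_sizes()
print(f"new empty dict: {fresh_size} bytes")
print(f"cleared dict:   {cleared_size} bytes")
```

`sys.getsizeof` reports only the memory directly attributed to the dict object, which is the quantity being compared in this issue.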