Yes. Values can be mutable, as tuple. A "pure" immutable dict is a frozendict with only immutable values (strings, tuples of numbers etc). In this case you also have a hash.
frozendict apart, I think some of my "tricks" could be applied to dict, if the implementation sounds and there's a real benefit. I'm not a C expert and my knowledge of the Python C API is very limited. But, if you core devs think it's worth it, I could try to submit some patch in CPython trunk. > one problem with frozen dicts in the past is deciding exactly what is mutable and what is immutable. Well, I had the same problem... Indeed it seems there's no way to use the duck typing to understand if a type is mutable or not. I suppose this can only be done using a convention, or a contract. If we know that an object is immutable, you can adopt some cache and speedup strategies. On Wed, 22 Jul 2020 at 06:53, Guido van Rossum <gu...@python.org> wrote: > That seems pretty clear — presumably it follows the lead of frozenset and > tuple. > > On Tue, Jul 21, 2020 at 21:45 Todd <toddr...@gmail.com> wrote: > >> What, exactly, is frozen? My understanding is that one problem with >> frozen dicts in the past is deciding exactly what is mutable and what is >> immutable. Can you change what object a key maps to so long as the set of >> keys stay the same? Can you modify the contents of mutable object that is >> a value? >> >> On Tue, Jul 21, 2020 at 6:30 PM Marco Sulla <marco.sulla.pyt...@gmail.com> >> wrote: >> >>> Let me first say that the code and discussion, as per title, is also >>> about possible performance improvements to the dict base type. >>> >>> TL;DR I implemented a frozendict using CPython 3.9 code. It seems that >>> an immutable dict *could* be faster than dict in some cases. Furthermore, >>> some optimization could be applied to dict too. >>> >>> Long explaining: >>> >>> Since now I have some time, I decided to experiment a little with the >>> creation of an immutable dict in Python. >>> >>> Unluckily, I started this experiment many months ago, so the CPython >>> code I used is old. Maybe some or all of my considerations are outdated. >>> >>> Initially, I wrote a quick and dirty implementation: >>> >>> https://github.com/Marco-Sulla/cpython/commit/fde4e6d236b19636063f8afedea8c50278205334 >>> >>> The code was very simple, and the performance was identical to dict. So >>> in theory, adding a frozendict to CPython is not hard. But there's more. >>> >>> After the first implementation, I started to try to improve the >>> performance of frozendict. The result of the improvements are here: >>> >>> >>> https://github.com/Marco-Sulla/cpython/blob/master/frozendict/test/bench.txt >>> >>> For benchmarks, I used simply timeit, with autorange and repeat and, as >>> suggested in the module documentation, I got the minimum of the results. >>> Here is the code: >>> >>> >>> https://github.com/Marco-Sulla/cpython/blob/master/frozendict/test/bench.py >>> >>> I have not tested with an optimized build, since optimization data is >>> collected using the unit tests, and I didn't write tests for frozendict in >>> the official CPython test framework. >>> >>> The tests and benchmarks were done using CPython 3.9a0. CPU and other pc >>> resources were not restricted using pyperf or similar tools, to see the >>> "real" speed. CPython was compiled using gcc and g++. >>> >>> In benchmarks, I compared methods and operators using dict and >>> frozendict. The benchmark uses dicts with all integers and dicts with all >>> strings. Furthermore, I tested dicts with size 8 (the most optimized size >>> in CPython) and 1000 elements (maybe too much, but I wanted to see how they >>> perform with a high number of elements). >>> >>> Every benchmark has a line in the output. The Name describes the >>> benchmarked method, operator or code snippet. Size is the size of the dict, >>> 8 or 1000. Keys are the keys type, str or int. Type is the dictionary type, >>> dict or frozendict. >>> >>> In Name, the "o" represents the object itself (dict or frozendict). "d", >>> in benchmark with dict, is "o"; in benchmarks with frozendict is an >>> equivalent instance of type dict. >>> >>> Some consideration: >>> >>> 1. frozendict is very fast, as expected, at copying. But it's also >>> faster at creation, using a (not frozen) dict, kwargs or a sequence2. >>> Speedups range from 20% to 45%. >>> 2. frozendict is also a bit faster when you iterate over it, especially >>> over values, where is ~15% faster >>> 3. hash seems really fast, but this is because it's cached the first >>> time hash() is invoked >>> 4. where frozendict is slower is when you unpickle it and with fromkeys >>> and repr. This is because I wrote a very naif implementation of these >>> methods, without optimizing them. The other methods have a comparable speed. >>> >>> Here is the code: >>> https://github.com/Marco-Sulla/cpython >>> >>> Here is the diff between the CPython code and my commits: >>> https://github.com/python/cpython/compare/master...Marco-Sulla:master >>> >>> About code >>> >>> I coded the implementation doing a simple copy/paste of the existing >>> dict functions, modifying their code and renaming them. This way I'm sure >>> dict continues to work as before, and I can compare the speed gain. >>> >>> Some of the optimization I adopted can be implemented in `dict` too. For >>> example, instead of creating an empty dict and resizing it, I create it >>> with the "maximum" size and I fill it. It seems to work, even if I did not >>> explore the possibility that a mutable object can change while a frozendict >>> creation is in progress. >>> >>> Some problems with optimizing dict and maintaining a frozendict: >>> >>> 1. duplication of code. To gain a significant boost, I had to copy and >>> paste a lot of code. Functions can be remerged again, but maybe the speedup >>> will be reduced. >>> 2. split vs combined dicts. As I wrote, split dicts seem to be faster in >>> reading than combined dicts. For example, iterating over values is faster >>> with a split dict, as expected. >>> But writing was not tested; furthermore, some of the optimizations can >>> be adopted for dicts too, so the convenience of a split table can be >>> lowered. >>> dict continues to maintain both split and combined tables, so this could >>> be not a problem. But the code could be less and more fast if only a table >>> layout is supported >>> 3. the CPython code I used is old, so some of the improvements I adopted >>> could be already implemented >>> >>> About frozendict >>> >>> Apart the considerations done in the [PEP 416]( >>> https://www.python.org/dev/peps/pep-0416/), that was rejected since >>> there was little gain from its implementation, I think that frozendict can >>> be useful as a substitute of MappingProxyType, that is really slow. >>> MappingProxyType is not much used, but it's present in critical parts of >>> CPython code, for example in _sre. We have to see if a mapping proxy type >>> *can* be substituted with an immutable map in some critical part of CPython >>> code. >>> >>> Furthermore, frozendicts could be used for implementing "immutable" >>> classes and modules, and can be used as a faster dict if its content does >>> not change. >>> _______________________________________________ >>> Python-ideas mailing list -- python-ideas@python.org >>> To unsubscribe send an email to python-ideas-le...@python.org >>> https://mail.python.org/mailman3/lists/python-ideas.python.org/ >>> Message archived at >>> https://mail.python.org/archives/list/python-ideas@python.org/message/K7CRVW6O7RO6DT3JIG3OAJCAVCA5CNTN/ >>> Code of Conduct: http://python.org/psf/codeofconduct/ >>> >> _______________________________________________ >> Python-ideas mailing list -- python-ideas@python.org >> To unsubscribe send an email to python-ideas-le...@python.org >> https://mail.python.org/mailman3/lists/python-ideas.python.org/ >> Message archived at >> https://mail.python.org/archives/list/python-ideas@python.org/message/GAJHZEBOU5BGYJAUNQP5V66KSS6HHSQD/ >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > -- > --Guido (mobile) > _______________________________________________ > Python-ideas mailing list -- python-ideas@python.org > To unsubscribe send an email to python-ideas-le...@python.org > https://mail.python.org/mailman3/lists/python-ideas.python.org/ > Message archived at > https://mail.python.org/archives/list/python-ideas@python.org/message/YPQRFJIHHQQAVYN5AZ53HONUJJRKEV5A/ > Code of Conduct: http://python.org/psf/codeofconduct/ >
_______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/KG737YMR3OC7KMYFAUWXG2P6MQ4HY62E/ Code of Conduct: http://python.org/psf/codeofconduct/