On 6 November 2017 at 21:18, Steve Holden <st...@holdenweb.com> wrote:
> I have to agree: I find the elevation of a CPython implementation detail to
> a language feature somewhat hard to comprehend. Maybe it's more to do with
> the way it's been presented, but this is hardly an enhancement the language
> has been screaming for for years.
>
> Presumably there is little concern that algorithms that rely on this
> behaviour will be perfectly syntactically conformant with earlier versions
> but will fail subtly and without explanation? It's a small concern, but a
> real one - particularly for learners.

A similar concern existed when we elevated sort stability to being a
language requirement - if you relied on that guarantee, your code was
technically buggy on versions prior to 2.3, but eventually 2.2 and
earlier aged out of general use, allowing such code to become correct
in general.

So the current discussion is mainly about deciding where we want the
compatibility burden to fall in relation to dict insertion ordering:

1. Do we deliberately revert CPython back to being harder to use
   correctly for the sake of making Python easier to implement?
2. Do we make Python harder to implement for the sake of making it
   easier to use?
3. Do we choose not to choose, thus implicitly choosing "2" by default
   due to the fact that Python is defined by a language spec and a
   reference implementation, rather than *just* a language spec?

Here's a more-complicated-than-a-doctest-for-a-dict-repr, but still
fairly straightforward, example regarding the "insertion ordering
dictionaries are easier to use correctly" argument:

    import json
    data = {"a": 1, "b": 2, "c": 3}
    rendered = json.dumps(data)
    data2 = json.loads(rendered)
    rendered2 = json.dumps(data2)
    # JSON round trip
    assert data == data2, "JSON round trip failed"
    # Dict round trip
    assert rendered == rendered2, "dict round trip failed"

Both of those assertions will always pass in CPython 3.6, as well as
in PyPy, because their dict implementations are insertion ordered,
which means the iteration order on the dictionaries is always "a",
"b", "c".

If you try it on 3.5 though, you should fairly consistently see that
last assertion fail, since there's nothing in 3.5 that ensures that
data and data2 will iterate over their keys in the same order.

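As a minimal sketch of why the first assertion is version independent
while the second is not: plain dict equality ignores key order, whereas
json.dumps emits keys in iteration order. The example below assumes an
insertion-ordered dict (CPython 3.6+ or PyPy); the names `forward` and
`backward` are purely illustrative.

```python
import json

# Two dicts with the same items, built in opposite insertion order.
forward = {"a": 1, "b": 2, "c": 3}
backward = {"c": 3, "b": 2, "a": 1}

# Plain dict equality ignores ordering, so this holds on every version.
assert forward == backward

# json.dumps walks the dict in iteration order, so with an
# insertion-ordered dict the two renderings differ.
rendered_forward = json.dumps(forward)
rendered_backward = json.dumps(backward)
print(rendered_forward)
print(rendered_backward)

# The rendered text differs, yet both parse back to equal dicts.
assert rendered_forward != rendered_backward
assert json.loads(rendered_forward) == json.loads(rendered_backward)
```

So an equality check on the loaded data says nothing about key order;
only a comparison of the rendered text (or of the key sequence) does.
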
You can make that code implementation independent (and sufficiently
version independent to pass both assertions) by using OrderedDict:

    from collections import OrderedDict
    import json
    data = OrderedDict(a=1, b=2, c=3)
    rendered = json.dumps(data)
    data2 = json.loads(rendered, object_pairs_hook=OrderedDict)
    rendered2 = json.dumps(data2)
    # JSON round trip
    assert data == data2, "JSON round trip failed"
    # Dict round trip
    assert rendered == rendered2, "dict round trip failed"

However, despite the way this code looks, the serialised key order
*might not* be "a, b, c" on 3.5 and earlier (it will be on 3.6+, since
that already requires that kwarg order be preserved).

So the formally correct version independent code that reliably ensures
that the key order in the JSON file is always "a, b, c" looks like
this:

    from collections import OrderedDict
    import json
    data = OrderedDict((("a", 1), ("b", 2), ("c", 3)))
    rendered = json.dumps(data)
    data2 = json.loads(rendered, object_pairs_hook=OrderedDict)
    rendered2 = json.dumps(data2)
    # JSON round trip
    assert data == data2, "JSON round trip failed"
    # Dict round trip
    assert rendered == rendered2, "dict round trip failed"
    # Key order
    assert "".join(data) == "".join(data2) == "abc", "key order failed"

Getting from the "Works on CPython 3.6+ but is technically
non-portable" state to a fully portable correct implementation that
ensures a particular key order in the JSON file thus currently
requires the following changes:

- don't use a dict display, use collections.OrderedDict
- make sure to set object_pairs_hook when using json.loads
- don't use kwargs to OrderedDict, use a sequence of 2-tuples

For 3.6, we've already said that we want the last constraint to age
out, such that the middle version of the code also ensures a
particular key order.

The proposal is that in 3.7 we retroactively declare that the first,
most obvious, version of this code should in fact reliably pass all
three assertions.

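For illustration, the three portability changes can be bundled into
one small helper. This is only a sketch: `ordered_round_trip` is a
hypothetical name, not something provided by the json module, and it
assumes the caller supplies the items as a sequence of 2-tuples.

```python
from collections import OrderedDict
import json


def ordered_round_trip(pairs):
    """Dump (key, value) pairs to JSON and load them back, keeping key order.

    Hypothetical helper for illustration, not part of the json module.
    """
    # A sequence of 2-tuples fixes the order without relying on dict
    # displays or keyword-argument ordering.
    data = OrderedDict(pairs)
    rendered = json.dumps(data)
    # object_pairs_hook makes json.loads build an OrderedDict as well.
    reloaded = json.loads(rendered, object_pairs_hook=OrderedDict)
    return data, rendered, reloaded


data, rendered, reloaded = ordered_round_trip([("c", 3), ("a", 1), ("b", 2)])
assert list(reloaded) == ["c", "a", "b"]
assert json.dumps(reloaded) == rendered

# Unlike plain dicts, two OrderedDicts compare equal only if their
# key order also matches.
assert data == reloaded
assert data != OrderedDict([("a", 1), ("b", 2), ("c", 3)])
```

Because OrderedDict equality is order sensitive, the "JSON round trip"
assertion in the portable version checks key order as well as content.
All of this is the extra ceremony that guaranteeing insertion order for
the builtin dict would make unnecessary.
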
Failing that, the proposal is that we instead change the dict
iteration implementation such that the dict round trip will start
failing reasonably consistently again (the same as it did in 3.5), so
that folks realise almost immediately that they still need
collections.OrderedDict instead of the builtin dict.

Cheers,
Nick.

-- 
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia