On Mon, Jun 4, 2018 at 4:02 PM, Yuval Greenfield <ubershme...@gmail.com> wrote:
> The proposed meanings surprised me too. My initial instinct for
> `dict.append` was that it would always succeed, much like `list.append`
> always succeeds.
Many of the methods named `append` in the standard library fail if adding the item would violate a constraint of the data structure. `list.append` is an exception because it stores uninterpreted object references, but, e.g., bytearray().append(-1) raises ValueError. Also, `dict.__setitem__` and `dict.update` fail if a key is unhashable, which is another dict-specific constraint.

Regardless, I'm not too attached to those names. I want the underlying functionality, and the names made sense to me. I'd be okay with `unique_setitem` and `unique_update`.

On Mon, Jun 4, 2018 at 5:25 PM, Steven D'Aprano <st...@pearwood.info> wrote:
> On Mon, Jun 04, 2018 at 02:22:29PM -0700, Ben Rudiak-Gould wrote:
>> Very often I expect that the key I'm adding to a dict isn't already in
>> it.
>
> Can you give some examples of when you want to do that? I'm having
> difficulty in thinking of any.

One example (or family of examples) is any situation where you would have a UNIQUE constraint on an indexed column in a database. If the values in a column should always be distinct, like the usernames in a table of user accounts, you can declare that column UNIQUE (or PRIMARY KEY), and any attempt to add a record with a duplicate username will fail.

People often use Python dicts to look up objects by some property of the object, which is similar to indexing a database column. When the values aren't necessarily unique (like a zip code), you have to use something like defaultdict(list) for the index, because Python doesn't have a dictionary that supports duplicate keys like C++'s std::multimap. When the values should be unique (like a username), the best data type for the index is dict, but there's no method on dicts with the desired behavior of refusing to add a record with a duplicate key. I think this is a frequent enough use case to deserve standard library support.

Of course you can implement the same functionality in other ways, but that's as true of databases as it is of Python. If SQL didn't have UNIQUE, every client of the database would have its own code for checking and enforcing the constraint. They'd all have different names and slightly different implementations. The uniqueness property they're supposed to guarantee would probably be documented only in comments, if at all. Some implementations would probably have bugs. You can't offload all of your programming needs onto the database developer, but I think UNIQUE is a useful enough feature to merit inclusion in SQL. And that's my argument for Python as well.
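To make that concrete, here's roughly the behavior I have in mind, as a sketch (the dict subclass is just for illustration - presumably the real thing would be methods on dict itself - and I'm assuming a mapping argument for `unique_update` for simplicity):

    class UniqueDict(dict):
        """Illustrative sketch only, not a proposed class."""

        def unique_setitem(self, key, value):
            # Like d[key] = value, but refuses to replace an existing
            # entry, the way a UNIQUE index refuses a duplicate row.
            if key in self:
                raise KeyError(key)
            self[key] = value

        def unique_update(self, other):
            # Check every key up front, so a failed update leaves the
            # dict unchanged rather than half-updated.
            items = list(other.items())
            for key, _ in items:
                if key in self:
                    raise KeyError(key)
            for key, value in items:
                self[key] = value

    by_username = UniqueDict()
    by_username.unique_setitem('alice', 'record 1')
    try:
        by_username.unique_setitem('alice', 'record 2')
    except KeyError as exc:
        print('duplicate username:', exc)   # duplicate username: 'alice'

Whether a real `unique_update` should pre-check like this or fail some other way is a detail; the point is that a duplicate key raises instead of being silently resolved in favor of either the old or the new value.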
Another example is keyword arguments. f(**{'a': 1}, **{'a': 2}) could mean f(a=1) or f(a=2), but actually it's an error. I think that was a good design decision: it's consistent with Python's general philosophy of raising exceptions when things look dodgy, which makes it much easier to find bugs.

Compare this to JavaScript, where if you pass four arguments to a function that expected three, the fourth is just discarded. If the actually incorrect argument was the first, not the fourth, then all of the arguments are bound to the wrong variables; and if an argument that was supposed to be a number gets a value of some other type as a consequence, and the function tries to add 1 to it, it still won't fail, but will produce some silly result like "[object Object]1". That wrong value then propagates through more of the code, until you finally get a wrong answer or a failure in code that's unrelated to the actually erroneous code.

I'm thankful that Python doesn't do that, and I wish it didn't do it even more than it already doesn't. Methods that raise an exception on duplicated keys, instead of silently discarding the old or new value, are an example of the sort of fail-safe operations that I'd like to see more of. For overridable options with defaults, `__setitem__` and `update` do the right thing; I certainly don't think they're useless.

> I'm sorry, I can't visualise how it would take you up to five lines to
> check and update a key. It shouldn't take more than two:
>
>     if key not in d:
>         d[key] = value
>
> Can you give a real-life example of the five line version?

The three lines I was thinking of were something like

    if k in d:
        raise KeyError(k)
    d[k] = ...

The five lines were something like

    d = get_mapping()
    k = get_key()
    if k in d:
        raise KeyError(k)
    d[k] = ...

as a checked version of

    get_mapping()[get_key()] = ...

(or, in general, any situation where you can't or don't want to duplicate the expressions that produce the mapping and the key).

> I don't see any connection between "append" and "fail if the key already
> exists". That's not what it means with lists.

If Python had built-in dictionaries with no unique-key constraint, and you started with multidict({'a': 1, 'b': 2}) and appended 'a': 3 to that, you'd get multidict({'a': 1, 'b': 2, 'a': 3}), just as if you'd appended ('a', 3) to [('a', 1), ('b', 2)], except that this "list" is indexed on the first half of each element. If you try to append 'a': 3 to the actual Python dict {'a': 1, 'b': 2}, it should fail, because {'a': 1, 'b': 2, 'a': 3} violates the unique-key constraint of that data structure.

The failure isn't the point, as such. It just means the method can't do what it's meant to do, which is add something to the dict while leaving everything that's already there alone.

--
Ben