On Sat, Mar 16, 2019 at 5:02 AM Gustavo Carneiro <gjcarne...@gmail.com>
wrote:

> On Sat, 16 Mar 2019 at 10:33, Steven D'Aprano <st...@pearwood.info> wrote:
>
>> On Fri, Mar 15, 2019 at 10:53:31PM +0000, MRAB wrote:
>>
>> > There was also the suggestion of having both << and >>.
>> >
>> > Actually, now that dicts are ordered, that would provide a use-case,
>> > because you would then be able to choose which values were overwritten
>> > whilst maintaining the order of the dict on the LHS.
>>
>> Is that common enough that it needs to be built-in to dict itself?
>>
>> If it is uncommon, then the conventional solution is to subclass dict,
>> overriding the merge operator to use first-seen semantics.
>>
>> The question this PEP is trying to answer is not "can we support every
>> use-case imaginable for a merge operator?" but "can we support the most
>> typical use-case?", which I believe is a version of:
>>
>>     new = a.copy()
>>     new.update(b)
>>     # do something with new
>>
>
> Already been said, but might have been forgotten, but the new proposed
> syntax:
>
>     new = a + b
>
> has to compete with the already existing syntax:
>
>     new = {**a, **b}
>
> The existing syntax is not exactly an operator in the mathematical sense
> (or is it?...), but my intuition is that it already triggers the visual
> processing part of the brain, similarly to operators.
>
> The only argument for "a + b" over "{**a, **b}" is that "a + b"
> is easier to discover, while not many programmers are familiar with
> "{**a, **b}".
>
> I wonder if this is only a matter of time, and over time programmers will
> become more accustomed to "{**a, **b}", thereby reducing the relative
> benefit of  "a + b"?  Especially as more and more developers migrate code
> bases from Python 2 to Python 3...
>

FWIW, even as a core developer I had forgotten that the {**a, **b} syntax
existed, thanks for the reminder! :)  But that's more likely because I
rarely write code that needs to update and merge a dict, or when I do it's
still 2-and-3 compatible.

Antoine said:

> If "+" is added to dicts, then we're overloading an already heavily used
> operator.  It makes reading code more difficult.

This really resonated with me.  Reading code gives you a feel for what
possible types something could be.  The set of possibilities for + is
admittedly already quite large in Python.  But making an existing core type
start supporting + *reduces the information given to the reader* by that
one line of code.  They now have more possibilities to consider and must
seek hints from more surrounding code.

For type inferencers such as us humans <https://en.wikipedia.org/wiki/Human>
or tools like pytype <https://github.com/google/pytype>, it means we need
to consider which version of Python's dict the code may be running under in
order to infer what it may mean from the code's context.  For tooling,
that's just a flag and a matter of conditionally changing the code defining
dict, but humans need to carry the possibility of that flag with them
everywhere.

We should just promote the use of {**d1, **d2} syntax for anyone who wants
an inline updated copy.
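
As a minimal sketch of what that looks like next to the spelled-out
copy-then-update form (d1 and d2 here are just placeholder dicts made up
for illustration):

```python
# Placeholder dicts, invented for this example.
d1 = {"host": "localhost", "port": 8080}
d2 = {"port": 9090, "debug": True}

# Inline updated copy via dict display unpacking (Python 3.5+).
merged = {**d1, **d2}

# The equivalent two-step form: copy, then update.
expected = d1.copy()
expected.update(d2)

# Later values win for duplicate keys; neither input is modified.
print(merged == expected)
```

Keys keep the position where they were first seen, while the value comes
from the last dict that supplied it.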

Why?

(1) It already exists.  (insert zen of python quote here)

(2) Copying via the + operator encourages inefficient code (already true
for bytes/str/list).  A single + is fine.  But the natural human extension
to that when people want to merge a bunch of things is to use a string of
multiple operators, because that is how we're taught math.  No matter what
type we're talking about, in Python this is an efficiency antipattern.

 z = a + b + c + d + e

That's four __add__ calls, each of which is a copy+update/extend/append
operation for a dict/list/str respectively.

We already tell people not to do that with bytes and lists, instead using
b''.join([a, b, c, d]), or z = []; z.extend(X)... calls, or something from
itertools.
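
For reference, a rough sketch of those alternatives; the sample data is
made up, and which one reads best depends on the code around it:

```python
import itertools

# Placeholder inputs, invented for this example.
chunks = [b"ab", b"cd", b"ef", b"gh"]
lists = [[1, 2], [3], [4, 5], [6]]

# bytes: join takes a single iterable of pieces and builds the result once.
joined = b"".join(chunks)

# list: extend one accumulator in place rather than chaining + operators.
z = []
for part in lists:
    z.extend(part)

# itertools: chain the inputs lazily, then materialize a single list.
flat = list(itertools.chain.from_iterable(lists))
```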

Given dict addition, it'd always be more efficient to join the "don't use
tons of + operators" club (a good lint warning) and write that as
 z = {**a, **b, **c, **d, **e}.
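
To make the cost difference concrete, here is a sketch using a
hypothetical AddDict subclass that implements the proposed + as
copy-then-update. AddDict and its items_copied counter are invented for
illustration; the counter is only a crude proxy for the work done.

```python
class AddDict(dict):
    """Hypothetical dict with a copy-and-update '+'; not a real API."""

    items_copied = 0  # class-level counter, for illustration only

    def __add__(self, other):
        new = AddDict(self)   # copy the left operand
        new.update(other)     # then apply the right operand
        AddDict.items_copied += len(self) + len(other)
        return new


# Five one-item dicts with distinct keys.
a, b, c, d, e = (AddDict({i: i}) for i in range(5))

# Four __add__ calls, each copying everything accumulated so far.
z = a + b + c + d + e
chained_cost = AddDict.items_copied

# One display: every input item is visited exactly once.
single = {**a, **b, **c, **d, **e}
single_cost = sum(len(x) for x in (a, b, c, d, e))

print(chained_cost, single_cost)
```

With similarly sized operands the chained form's work grows roughly
quadratically with the number of operands, while the single display
stays linear.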

Unless the copy+update concept is an extremely common operation, having
more than one way to do it feels like it'll cause more cognitive harm than
good.

Now (2) could *also* be used as an argument that Python should detect
chains of operators and allow those to be optimized.  That'd be a PEP of
its own and is complicated to do.  It is technically a semantic change
given how dynamic Python is: we do name lookups at time of use, and each
__add__ call could potentially have side effects changing the results of
future name lookups (the a+b could change the meaning of c).  Yes, that is
horrible and people writing code that does that deserve very bad things,
but those are the semantics we'd be breaking if we tried to detect and
support a mythical new construct like __chained_add__ being invoked when
the types of all elements being added sequentially are identical (how
identical? do subtypes count? see, complicated).

a + b + c:
       0 LOAD_GLOBAL              0 (a)
       2 LOAD_GLOBAL              1 (b)
       4 BINARY_ADD
       6 LOAD_GLOBAL              2 (c)
       8 BINARY_ADD
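
A contrived sketch of the side-effect problem, assuming a made-up Sneaky
class whose __add__ rebinds the global name c.  Because c is only loaded
after the first addition has run, the second addition sees the
replacement:

```python
class Sneaky:
    """Hypothetical type whose __add__ rebinds a global as a side effect."""

    def __init__(self, label):
        self.label = label

    def __add__(self, other):
        global c
        c = Sneaky("replaced")  # changes what the name `c` refers to
        return Sneaky(self.label + "+" + other.label)


def demo():
    # a, b and c are globals, so each is looked up at time of use.
    global a, b, c
    a, b, c = Sneaky("a"), Sneaky("b"), Sneaky("c")
    return (a + b + c).label


print(demo())  # a+b+replaced
```

An optimizer that gathered a, b and c up front and handed them to a
single chained call would produce a+b+c instead, which is the kind of
observable difference that makes this a semantic change.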

-gps
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
