Re: [Python-ideas] PEP: Dict addition and subtraction

2019-03-02 Thread Brandt Bucher
Hi Steven.

Thanks for the clarifications. I've pushed a complete working patch (with
tests) to GitHub. It's linked to the bpo issue.

Branch: https://github.com/brandtbucher/cpython/tree/addiction
PR: https://github.com/python/cpython/pull/12088

Right now, it's pretty much a straight reimplementation of your Python
examples. I plan to update it periodically to keep it in sync with any
changes, and to make a few optimizations (for example, when operands are
identical or empty).

Let me know if you have any questions/suggestions. Stoked to learn and help
out with this process! :)

Brandt

On Fri, Mar 1, 2019 at 7:57 PM Steven D'Aprano  wrote:

> Executive summary:
>
> - I'm going to argue for subclass-preserving behaviour;
>
> - I'm not wedded to the idea that dict += should actually call the
> update method, so long as it has the same behaviour;
>
> - __iadd__ has no need to return NotImplemented or type-check its
> argument.
>
> Details below.
>
>
> On Fri, Mar 01, 2019 at 04:10:44PM -0800, Brandt Bucher wrote:
>
> [...]
> > In your Python implementation samples from the PEP, dict subclasses will
> > behave differently from how list subclasses do. List subclasses, without
> > overrides, return *list* objects for bare "+" operations
>
> Right -- and I think they are wrong to do so, for reasons I explained
> here:
>
> https://mail.python.org/pipermail/python-ideas/2019-March/055547.html
>
> I think the standard handling of subclasses in Python builtins is wrong,
> and I don't wish to emulate that wrong behaviour without a really good
> reason. Or at least a better reason than "other methods break
> subclassing unless explicitly overloaded, so this should do so too".
>
> Or at least not without a fight :-)
>
>
>
> > (and "+=" won't call an overridden "extend" method).
>
> I'm slightly less opinionated about that. Looking more closely into the
> docs, I see that they don't actually say that += calls list.extend:
>
> s.extend(t)   extends s with the contents of t (for
> or s += t the most part the same as s[len(s):len(s)] = t)
>
> https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types
>
> only that they have the same effect. So the wording re lists calling
> extend certainly needs to be changed. But that doesn't mean that we must
> change the implementation. We have a choice:
>
> - regardless of what lists do, we define += for dicts as literally
>   calling dict.update; the more I think about it, the less I like this.
>
> - Or we say that += behaves similarly to update, without actually
>   calling the method. I think I prefer this.
>
> (The second implies either that += either contains a duplicate of the
> update logic, or that += and update both delegate to a private, C-level
> function that does most of the work.)
>
> I think that the second approach (define += as having the equivalent
> semantics of update but without actually calling the update method) is
> probably better. That decouples the two methods, allows subclasses to
> change one without necessarily changing the other.
>
>
> > So a more analogous
> > pseudo-implementation (if that's what we seek) would look like:
> >
> > def __add__(self, other):
> > if isinstance(other, dict):
> > new = dict.copy(self)
> > dict.update(new, other)
> > return new
> > return NotImplemented
>
> We should not require the copy method.
>
> The PEP should be more explicit that the approximate implementation does
> not imply the copy() and update() methods are actually called.
>
>
> > def __iadd__(self, other):
> > if isinstance(other, dict):
> > dict.update(self, other)
> > return self
> > return NotImplemented
>
> I don't agree with that implementation.
>
> According to PEP 203, which introduced augmented assignment, the
> sequence of calls in ``d += e`` is:
>
> 1. Try to call ``d.__iadd__(e)``.
>
> 2. If __iadd__ is not present, try ``d.__add__(e)``.
>
> 3. If __add__ is missing too, try ``e.__radd__(d)``.
>
> but my tests suggest this is inaccurate. I think the correct behaviour
> is this:
>
> 1. Try to call ``d.__iadd__(e)``.
>
> 2. If __iadd__ is not present, or if it returns NotImplemented,
>try ``d.__add__(e)``.
>
> 3. If __add__ is missing too, or if it returns NotImplemented,
>fail with TypeError.
>
> In other words, e.__radd__ is not used.
>
> We don't want dict.__iadd__ to try calling __add__, since the later is
> more restrictive and less efficient than the in-place merge. So there is
> no need for __iadd__ to return NotImplemented. It should either succeed
> on its own, or fail hard:
>
> def __iadd__(self, other):
> self.update(other)
> return self
>
> Except that the actual C implementation won't call the update method
> itself, but will follow the same semantics.
>
> See the docstring for dict.update for details of what is accepted by
> update.
>
>
> --
> Steven
> ___
> Python-ideas mailing list
> 

Re: [Python-ideas] Dict joining using + and +=

2019-03-02 Thread MRAB

On 2019-03-02 22:02, francismb wrote:


On 2/27/19 7:14 PM, MRAB wrote:

Are there any advantages of using '+' over '|'?

or for e.g. '<=' (d1 <= d2) over '+' (d1 + d2)

'<=' is for comparison, less-than-or-equal (in the case of sets, subset, 
which is sort of the same kind of thing). Using it for anything else in 
Python would be too confusing.

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Dict joining using + and +=

2019-03-02 Thread francismb


On 2/27/19 7:14 PM, MRAB wrote:
> Are there any advantages of using '+' over '|'?
or for e.g. '<=' (d1 <= d2) over '+' (d1 + d2)

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Dict joining using + and +=

2019-03-02 Thread francismb


On 3/2/19 8:14 PM, Raymond Hettinger wrote:
> Lastly, I'm still bugged by use of the + operator for replace-logic instead 
> of additive-logic.  With numbers and lists and Counters, the plus operator 
> creates a new object where all the contents of each operand contribute to the 
> result.  With dicts, some of the contents for the left operand get 
> thrown-away.  This doesn't seem like addition to me (IIRC that is also why 
> sets have "|" instead of "+").
+1, it's a good point. IMHO the proposed (meaning) overloading for + and
+= is too much/unclear. If the idea is to 'join' dicts why not to use
"d.join(...here the other dicts ...)"

Regards,
--francis
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Dict joining using + and +=

2019-03-02 Thread Raymond Hettinger


> On Mar 1, 2019, at 11:31 AM, Guido van Rossum  wrote:
> 
> There's a compromise solution for this possible. We already do this for 
> Sequence and MutableSequence: Sequence does *not* define __add__, but 
> MutableSequence *does* define __iadd__, and the default implementation just 
> calls self.update(other). I propose the same for Mapping (do nothing) and 
> MutableMapping: make the default __iadd__ implementation call 
> self.update(other).

Usually, it's easy to add methods to classes without creating disruption, but 
ABCs are more problematic.  If MutableMapping grows an __iadd__() method, what 
would that mean for existing classes that register as MutableMapping but don't 
already implement __iadd__?  When "isinstance(m, MutableMapping)" returns True, 
is it a promise that the API is fully implemented? Is this something that mypy 
could would or should complain about?

> Anyways, the main reason to prefer d1+d2 over {**d1, **d2} is that the latter 
> is highly non-obvious except if you've already encountered that pattern before

I concur.  The latter is also an eyesore and almost certain to be a stumbling 
block when reading code.

That said, I'm not sure we actually need a short-cut for "d=e.copy(); 
d.update(f)".  Code like this comes-up for me perhaps once a year.  Having a 
plus operator on dicts would likely save me five seconds per year.

If the existing code were in the form of "d=e.copy(); d.update(f); d.update(g); 
d.update(h)", converting it to "d = e + f + g + h" would be a tempting but 
algorithmically poor thing to do (because the behavior is quadratic).  Most 
likely, the right thing to do would be "d = ChainMap(e, f, g, h)" for a 
zero-copy solution or "d = dict(ChainMap(e, f, g, h))" to flatten the result 
without incurring quadratic costs.  Both of those are short and clear.

Lastly, I'm still bugged by use of the + operator for replace-logic instead of 
additive-logic.  With numbers and lists and Counters, the plus operator creates 
a new object where all the contents of each operand contribute to the result.  
With dicts, some of the contents for the left operand get thrown-away.  This 
doesn't seem like addition to me (IIRC that is also why sets have "|" instead 
of "+").
 

Raymond


___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/