[Python-ideas] Overloading assignment concrete proposal (Re: Re: Operator as first class citizens -- like in scala -- or yet another new operator?)

Andrew Barnert via Python-ideas Wed, 19 Jun 2019 15:12:44 -0700

 The thread on operators as first-class citizens keeps getting vague ideas 
about assignment overloading that wouldn't actually work, or don't even make 
sense. I think it's worth writing down the simplest design that would actually 
work, so people can see why it's not a good idea (or explain why they think it 
would be anyway).


in pseudocode, just as x += y means this:
    xval = globals()['x']    try:        result = xval.__iadd__(y)    except 
AttributeError:        result = xval.__add__(y)    globals()['x'] = result
… x = y would mean this:
    try:
        xval = globals()['x']        result = xval.__iassign__(y)
    except (LookupErrorr, AttributeError):        result = y    globals()['x'] 
= result
If you don't understand why this would work or why it wouldn't be a great idea 
(or want to nitpick details), read on; otherwise, you can skip the rest of this 
message.
---
First, why is there even a problem? Because Python doesn't even have 
"variables" in the same sense that languages like C++ that allow assignment 
overloading do.
In C++, a variable is an "lvalue", a location with identity and type, and an 
object is just a value that lives in a location. So assignment is an operation 
on variables: x = 2 is the same as XClass::operator=(&x, y).
In Python, an object is a value that lives wherever it wants, with identity and 
type, and a variable is just a name that can be bound to a value in a 
namespace. So assignment is an operation on namespaces, not on variables: x = 2 
is the same as dict.__settem__(globals(), 'x', 2).
The same thing is true for more complicated assignments. For example, a.x = 2 
is just an operation on a's namespace instead of the global namespace: 
type(a).__setattr__(a, 'x', 2). Likewise, a.b['x'] = 2 is 
type(a.b).__setitem__(a.b, 'x', 2), And so on,
---

But Python allows overloading augmented assignment. How does that work? There's 
a perfectly normal namespace lookup at the start and namespace store at the 
end—but in between, the existing value of the target gets to specify the value 
being assigned.
Immutable types like int don't define __iadd__, and __add__ creates and returns 
a new object. So, x += y ends up the same as x = x + y.
But mutable types like list define an __iadd__ that mutates self in-place and 
then returns self, so x gets harmlessly rebound to the same object it was 
already bound to. So x += y ends up the same as x.extend(y); x = x.
The exact same technique would work for overloading normal assignment. The only 
difference is that x += y is illegal if x is unbound, while x = y obviously has 
to be legal (and mean there is no value to intercept the assignment). So, the 
fallback happens when xval doesn't define __iassign__, but also when x isn't 
bound at all.

So, for immutable types like eint, and almost all mutable types like list—and 
when x is unbound—x = y does the same thing it always did.
But special types that want to act like transparent mutable handles define an 
__iassign__ that mutates self in place and returns self, so x gets harmlessly 
rebound to the same object. So x = y ends up the same as, say, x.set_target(y); 
x = x.
This all works the same if the variables are local rather than global, or for 
more complicated targets like attribution or subscription, and even for target 
lists; the intercept still happens the same way, between the (more complicated) 
lookup and storage steps.
---
Now, why is this a bad idea?
First, the benefit of __iassign__ is a lot smaller than __iadd__. A sizable 
fraction of "x += y" statements are for mutable "x" values, but only a rare 
handful of "x = y" statements would be for special handle "x" values. Even the 
same cost for a much smaller benefit would be a much harder sell.
But the runtime performance cost difference is huge. If augmented assignment 
weren't overloadable, it would still have to lookup the value, lookup and call 
a special method on it, and store the value. The only cost overloading adds is 
trying two special methods instead of one, which is tiny. But regular 
assignment doesn't have to do a value lookup or a special method call at all, 
only a store; adding those steps would roughly double the cost of every new 
variable assignment, and even more for every reassignment. And assignments are 
very common in Python, even within inner loops, so we're talking about a huge 
slowdown to almost every program out there.

Also, the fact that assignment always means assignment makes Python code easier 
both for humans to skim, and for automated programs to process. Consider, for 
example, a static type checker like mypy. Today, x = 2 means that x must now be 
an int, always. But if x could be a Signal object with an overloaded 
__iassign__, then, x = 2 might mean that x must now be an int, or it might mean 
that x must now be whatever type(x).__iassign__ returns.
Finally, the complexity of __iassign__ is at least a little higher than 
__iadd__. Notice that in my pseudocode above, I cheated—obviously the xval =  
and result = lines are not supposed to recursively call the same pseudocode, 
but to directly store a value in new temporary local variable. In the real 
implementation, there wouldn't even be such a temporary variable (in CPython, 
the values would just be pushed on the stack), but for documenting the 
behavior, teaching it to students, etc., that doesn't matter. Being precise 
here wouldn't be hugely difficult, but it is a little more difficult than with 
__iadd__, where there's no similar potential confusion even possible.    On 
Wednesday, June 19, 2019, 10:54:04 AM PDT, Andrew Barnert via Python-ideas 
<[email protected]> wrote:  
 
 On Jun 18, 2019, at 12:43, nate lust <[email protected]> wrote:


I have been following this discussion for a long time, and coincidentally I 
recently started working on a project that could make use of assignment 
overloading. (As an aside it is a configuration system for a astronomical data 
analysis pipeline that makes heavy use of descriptors to work around historical 
decisions and backward compatibility). Our system makes use of nested chains of 
objects and descriptors and proxy object to manage where state is actually 
stored. The whole system could collapse down nicely if there were assignment 
overloading. However, this works OK most of the time, but sometimes at the end 
of the chain things can become quite complicated. I was new to this code base 
and tasked with making some additions to it, and wished for an assignment 
operator, but knew the data binding model of python was incompatible from p.
This got me thinking. I didnt actually need to overload assignment per-say, 
data binding could stay just how it was, but if there was a magic method that 
worked similar to how __get__ works for descriptors but would be called on any 
variable lookup (if the method was defined) it would allow for something akin 
to assignment. 

What counts as “variable lookup”? In particular:

For example:
class Foo:    def __init__(self):        self.value = 6        self.myself = 
weakref.ref(self)    def important_work(self):        print(self.value)

… why doesn’t every one of those “self” lookups call self.__get_self__()? It’s 
a local variable being looked up by name, just like your “foo” below, and it 
finds the same value, which has the same __get_self__ method on its type.
The only viable answer seems to that it does. So, to avoid infinite 
circularity, your class needs to use the same kind of workaround used for 
attribute lookup in classes that define __getattribute__ and/or __setattr__:

    def important_work(self):        print(object.__get_self__(self).value)

    def __get_self__(self):        return object.__get_self__(self).myself

But even that won’t work here, because you still have to look up self to call 
the superclass method on it. I think it would require some new syntax, or at 
least something horrible involving locals(), to allow you to write the 
appropriate methods.

    def __get_self__(self):        return self.myself

Besides recursively calling itself for that “self” lookup, why doesn’t this 
also call weakref.ref.__get_self__ for that “myself” lookup? It’s an attribute 
lookup rather than a local namespace lookup, but surely you need that to work 
too, or as soon as you store a Foo instance in another object it stops 
overloading.
For this case there’s at least an obvious answer: because weakref.ref doesn’t 
override that method, the variable lookup doesn’t get intercepted. But notice 
that this means every single value access in Python now has to do an extra 
special-method lookup that almost always does nothing, which is going to be 
very expensive.

    def __setattr__(self, name, value):        self.value = value

You can’t write __setattr__ methods this way. That assignment statement just 
calls self.__setattr__(‘value’, value), which will endlessly recurse. That’s 
why you need something like the object method call to break the circularity.
Also, this will take over the attribute assignments in your __init__ method. 
And, because it ignores the name and always sets the value attribute, it means 
that self.myself = is just going to override value rather than setting myself.
To solve both of these problems, you want a standard __setattr__ body here:

    def __setattr__(self, name, value):        object.__setattr__(self, name, 
value)

But that immediately makes it obvious that your __setattr__ isn’t actually 
doing anything, and could just be left out entirely.

foo = Foo() # Create an instancefoo # The interpreter would return 
foo.myselffoo.value # The interpreter would return foo.myself.value

foo = 19 # The interpreter would run foo.myself = 6 which would invoke 
foo.__setattr__('myself', 19)

For this last one, why would it do that? There’s no lookup here at all, only an 
assignment.
The only way to make this work would be for the interpreter to lookup the 
current value of the target on every assignment before assigning to it, so that 
lookup could be overloaded. If that were doable, then assignment would already 
be overloadable, and this whole discussion wouldn’t exist.
But, even if you did add that, __get_self__ is just returning the value 
self.myself, not some kind of reference to it. How can the interpreter figure 
out that the weakref.ref value it got came from looking up the name “myself” on 
the Foo instance? (This is the same reason __getattr__ can’t help you override 
attribute setting, and a separate method __setattr__ is needed.) To make this 
work, you’d need a __set_self__ to go along with __get_self__. Otherwise, your 
changes not only don’t provide a way to do assignment overloading, they’d break 
assignment overloading if it existed.
Also, all of the extra stuff you’re trying to add on top of assignment 
overloading can already be done today. You just want a transparent proxy: a 
class whose instances act like a reference to some other object, and delegate 
all methods (and maybe attribute lookups and assignments) to it. This is 
already pretty easy; you can define __getattr__ (and __setattr__) to do it 
dynamically, or you can do some clever stuff to create static delegating 
methods (and properties) explicitly at object-creation or class-creation time. 
Then foo.value returns foo.myself.value, foo.important_work() calls the Foo 
method but foo.__str__() calls foo.myself.__str__(), you can even make it pass 
isinstance checks if you want. The only thing it can’t do is overload 
assignment.
I think the real problem here is that you’re thinking about references to 
variables rather than values, and overloading operators on variables rather 
than values, and neither of those makes sense in Python. Looking up, or 
assigning to, a local variable named “foo” is not an operation on “the foo 
variable”, because there is no such thing; it’s an operation on the locals 
namespace._______________________________________________
Python-ideas mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/4JMNZE4S5RPAMRGLJOUHNTCXCOUXVEBO/
Code of Conduct: http://python.org/psf/codeofconduct/

_______________________________________________
Python-ideas mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/MA74LMTPQECXX3JQXATXSR5TZCZX3UEF/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-ideas] Overloading assignment concrete proposal (Re: Re: Operator as first class citizens -- like in scala -- or yet another new operator?)

Reply via email to