After a few days of thinking and experimenting, I’ve been convinced that `copy`
(and also `__copy__`) is not the right protocol for what we want to do here. I
believe that PEP 584 can likely continue without subclass-preserving behavior, and
that better behavior could perhaps be added to *all* built-in types
later, since it’s outside the scope of this PEP.
> My opinion is that Python built-in types make subclassing unnecessarily(?)
> awkward to use, and I would like to see that change.
Yes! But, on further reflection, I don’t think this is the correct way of
approaching it.
> For example, consider subclassing float. If you want your subclass to be
> actually usable, you have to write a whole bunch of boilerplate, otherwise
> the first time you perform arithmetic on your subclass, it will be converted
> to a regular old float… This is painful and adds a great amount of friction
> to subclassing.
`float` is a *perfect* example of the problems with the way things are
currently, so let’s focus on this.
Currently, subclassing `float` requires ~30 overridden methods of repetitive
(but non-trivial) boilerplate to get everything working right. However, calling
the `float` equivalent of `dict.copy()` on the LHS before proceeding with the
default implementation wouldn’t help us, because floats (like many built-in
types) are immutable. So a new, plain, built-in `float` would still be returned
by the default machinery. It doesn’t know how to construct a new, different
instance of our subclass, and it can’t change one it’s already built.
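To make the current behavior concrete, here is a minimal sketch. The `Celsius` class and its `unit` attribute are made up for illustration; the only thing it relies on is that `float.__add__` returns a plain `float`.

```python
class Celsius(float):
    """Hypothetical float subclass carrying one piece of extra state."""

    def __new__(cls, value, unit="C"):
        # float is immutable, so the value has to be set in __new__.
        self = super().__new__(cls, value)
        self.unit = unit
        return self


a = Celsius(20.0)
b = a + 1.5  # float.__add__ runs; it knows nothing about Celsius

print(type(a).__name__)    # Celsius
print(type(b).__name__)    # float
print(hasattr(b, "unit"))  # False
```

The subclass and its state are both gone after a single addition, and there is no existing instance that a `copy()`-style protocol could have mutated to prevent that.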
This leads me to believe that we’re approaching the problem wrong. Rather than
making a copy and working on it, I think the problem would be better served by
a protocol that runs the default implementation, *then* calls some dunder hook
on the subclass to build a new instance.
Let’s call this method `__build__`. I’m not sure what its arguments would look
like, but it would probably need to take at least `self` and an instance of the
built-in base class (in this case a `float`), and return a new instance of the
subclass based on the two. It would likely also need to work with `cls` instead
of `self` for `classmethod` constructors like `dict.fromkeys`, or have a second
hook for that case.
By subclassing `float` and defining `__build__` as something like this:
```
class MyFloat(float):
    …
    def __build__(self, result):
        # Rebuild the subclass from the plain float result, carrying state over.
        return MyFloat(result, some_state=self.some_state)
    …
```
I could now trust the built-in `float` machinery to try calling
`lhs.__build__(result)` on the result that *would* have been returned *before*
returning it. This is a simple example, but a protocol like this would work for
mutables as well.
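Nothing like this exists today, but the idea can be roughly emulated in pure Python to see how it would feel. In the sketch below, `BuildingFloat`, `_wrap`, and the list of wrapped operators are all my own stand-ins for what the built-in machinery would do internally; only `__build__` and `some_state` come from the example above.

```python
class BuildingFloat(float):
    """Stand-in for a float whose operators honor a hypothetical __build__ hook."""

    def __build__(self, result):
        # Default hook: re-wrap the plain float in our own class.
        return type(self)(result)


def _wrap(name):
    base_method = getattr(float, name)

    def method(self, *args):
        # Run the default implementation first...
        result = base_method(self, *args)
        # ...then hand any plain float result to the subclass hook.
        if isinstance(result, float):
            return self.__build__(result)
        return result  # e.g. NotImplemented

    method.__name__ = name
    return method


# Only a handful of operators are wrapped here, to keep the sketch short.
for _name in ("__add__", "__radd__", "__sub__", "__rsub__",
              "__mul__", "__rmul__", "__truediv__", "__rtruediv__",
              "__neg__", "__abs__"):
    setattr(BuildingFloat, _name, _wrap(_name))


class MyFloat(BuildingFloat):
    def __new__(cls, value, some_state=None):
        self = super().__new__(cls, value)
        self.some_state = some_state
        return self

    def __build__(self, result):
        return MyFloat(result, some_state=self.some_state)


x = MyFloat(1.5, some_state="tag")
y = x * 2 + 1
print(type(y).__name__, float(y), y.some_state)  # MyFloat 4.0 tag
```

The subclass author writes one method instead of ~30, and the arithmetic itself is still done by the ordinary `float` code.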
> A more pertinent example, from dict itself:
If `dict` *were* to grow more operators, they would likely be `^`, `&`, and
`-`. You can consider the case of subclassing `set` or `frozenset`, since they
currently have those. Calling `lhs.copy()` first before updating is fine for
additive operations like `|`, but for subtractive operations like the others,
this can be very bad for performance, especially if we’re now *required* to
call it. Again, doing things the default way, and *then* constructing the
appropriate subclass in an agreed-upon way seems like the path to take here.
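Here is a rough sketch of the two orderings for `&` on a `set` subclass. `TaggedSet`, its `tag` attribute, and the two method names are hypothetical, and `__build__` is again the proposed hook rather than anything that exists.

```python
class TaggedSet(set):
    """Hypothetical set subclass; `tag` is illustrative extra state."""

    def __init__(self, iterable=(), tag=None):
        super().__init__(iterable)
        self.tag = tag

    def __build__(self, result):
        return TaggedSet(result, tag=self.tag)

    def and_copy_first(self, other):
        # Copy-first: duplicate every element of self, then throw most away.
        new = TaggedSet(self, tag=self.tag)
        new.intersection_update(other)
        return new

    def and_build_after(self, other):
        # Build-after: let the default code compute the (small) result,
        # then wrap that result once.
        return self.__build__(set.__and__(self, other))


big = TaggedSet(range(100_000), tag="ids")
small = {1, 2, 3}

a = big.and_copy_first(small)
b = big.and_build_after(small)
print(sorted(a), a.tag)  # [1, 2, 3] ids
print(sorted(b), b.tag)  # [1, 2, 3] ids
```

Both give the same answer, but the first one has to copy all 100,000 elements before discarding nearly all of them, while the second only ever constructs a subclass instance from the three-element result.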
> Changing all builtins is a big, backwards-incompatible change.
If implemented right, a system like the one described above (`__build__`)
wouldn’t be backward-incompatible, as long as nobody was already using the name.
Just food for thought. I think this is a much bigger issue than PEP 584, but
I'm convinced that the consistent status quo should prevail until a suitable
solution for all types can be worked out (if ever).