Re: [PHP-DEV] [Pre-RFC Discussion] User Defined Operator Overloads (again)

Jordan LeDoux Tue, 17 Sep 2024 11:14:23 -0700

On Tue, Sep 17, 2024 at 10:55 AM Davey Shafik <m...@daveyshafik.com> wrote:


>
>
> On Sep 17, 2024, at 10:15, Jordan LeDoux <jordan.led...@gmail.com> wrote:
>
>
>
> On Tue, Sep 17, 2024 at 1:18 AM Rowan Tommins [IMSoP] <
> imsop....@rwec.co.uk> wrote:
>
>> On 14/09/2024 22:48, Jordan LeDoux wrote:
>> >
>> > 1. Should the next version of this RFC use the `operator` keyword, or
>> > should that approach be abandoned for something more familiar? Why do
>> > you feel that way?
>> >
>> > 2. Should the capability to overload comparison operators be provided
>> > in the same RFC, or would it be better to separate that into its own
>> > RFC? Why do you feel that way?
>> >
>> > 3. Do you feel there were any glaring design weaknesses in the
>> > previous RFC that should be addressed before it is re-proposed?
>> >
>>
>> I think there are two fundamental decisions which inform a lot of the
>> rest of the design:
>>
>> 1. Are we over-riding *operators* or *operations*? That is, is the user
>> saying "this is what happens when you put a + symbol between two Foo
>> objects", or "this is what happens when you add two Foo objects together"?
>>
>
> If we allow developers to define arbitrary code which is executed as a
> result of an operator, we will always end up allowing the first one.
>
>
>> 2. How do we despatch a binary operator to one of its operands? That is,
>> given $a + $b, where $a and $b are objects of different classes, how do
>> we choose which implementation to run?
>>
>>
> This is something not many other people have been interested in so far,
> but interestingly there is a lot of prior art on this question in other
> languages! :)
>
> The best approach, from what I have seen of developer usage in other
> languages, is somewhat complicated to follow, but I will do my best to make
> sure it is understandable to anyone who happens to be following this thread
> on internals.
>
> The approach I plan to use for this question has a name: Polymorphic
> Handler Resolution. The overload that is executed will be decided by the
> following series of decisions:
>
> 1. Are both of the operands objects? If not, use the overload on the one
> that is. (NOTE: if neither are objects, the new code will be bypassed
> entirely, so I do not need to handle this case)
> 2. If they are both objects, are they both instances of the same class? If
> they are, use the overload of the one on the left.
> 3. If they are not objects of the same class, is one of them a direct
> descendant of the other? If so, use the overload of the descendant.
> 4. If neither of them are direct descendants of the other, use the
> overload of the object on the left. Does it produce a type error because it
> does not accept objects of the type in the other position? Return the error
> and abort instead of re-trying by using the overload on the right.
>
> This results from what it means to `extend` a class. Suppose you have a
> class `Foo` and a class `Bar` that extends `Foo`. If both `Foo` and `Bar`
> implement an overload, that means `Bar` inherited an overload. It is either
> the same as the overload from `Foo`, in which case it shouldn't matter
> which is executed, or it has been updated with even more specific logic
> which is aware of the extra context that `Bar` provides, in which case we
> want to execute the updated implementation.
>
> So the implementation on the left would almost always be executed, unless
> the implementation on the right comes from a class that is a direct
> descendant of the class on the left.
>
> `Foo + Bar`
> `Bar + Foo`
>
> Both of these expressions would execute the overload from `Bar`.
>
> In practice, you would very rarely (if ever) use two classes from entirely
> different class inheritance hierarchies in the same overload. That would
> closely tie the two classes together in a way that most developers try to
> avoid, because the implementation would need to be aware of how to handle
> the classes it accepts as an argument.
>
> The exception to this that I can imagine is something like a container
> that maybe does not care what class the other object is because it doesn't
> mutate it, only stores it.
>
> But for virtually every real-world use case, executing the overload for
> the child class regardless of its position would be preferred, because
> overloads will tend to be confined to the core types of PHP + the classes
> that are part of the hierarchy the overload is designed to interact with.
>
>
>>
>>
>> Finally, a very quick note on the OperandPosition enum: I think just a
>> "bool $isReversed" would be fine - the "natural" expansion of "$a+$b" is
>> "$a->operator+($b, false)"; the "fallback" is "$b->operator+($a, true)"
>>
>>
>> Regards,
>>
>> --
>> Rowan Tommins
>> [IMSoP]
>>
>
> This is similar to what I originally designed, and I actually moved to an
> enum based on feedback. The argument was something like `$isReversed` or
> `$left` or so on is somewhat ambiguous, while the enum makes it extremely
> explicit.
>
> However, it's not a design detail I am committed to. I just want to let
> you know why it was done that way.
>
> Jordan
>
>
> To be clear: I’m very much in favor of operator overloading. I frequently
> work with both Money value objects, and DateTime objects that I need to
> manipulate through arithmetic with others of the same type.
>
> What if I wanted to create a generic `add($a, $b)` function, how would I
> type hint the params to ensure that I only get “addable” things? I would
> expect that to be:
>
> - Ints
> - Floats
> - Objects of classes with “operator+” defined
>
> I think that an interface is the right solution for that, and you can just
> union with int/float type hints: add(int | float | Addable ...$operands) (or
> add(int | float | (Foo & Addable) ...$operands))
>
> Is this type of behavior even allowed? I think the intention is that it
> must be; otherwise the decision over which overload method gets called
> would be drastically simplified.
>
> Perhaps for a first iteration, operator overloads only work between
> objects of the same type or their descendants, and if a descendant
> overrides the overload, the descendant's version is used regardless of
> left/right precedence.
>
> I suspect this will simplify the complexity of the magic, and solve the
> majority of cases where operator overloading is desired.
>
> - Davey
>

The problem with providing interfaces is something that nikic addressed very
early in my design process and convinced me of: an `Addable` interface will
not actually tell you if two objects can be added together. A `Money` class
and a `Vector2D` class might both have an implementation for `operator +()`
and implement some kind of `Addable` interface. But there is no sensible
way in which they could actually be added. Knowing that an object
implements an overload is not enough in most cases to use operators with
it. This is part of the reason that I am skeptical of people who worry
about accidentally using random overloads.

The signature for the implementation in the `Money` class might look
something like this:

`operator +(Money $other, OperandPosition $position): Money`

while the signature for the implementation in the `Vector2D` class might
look something like this:

`operator +(Vector2D|array $other, OperandPosition $position): Vector2D`

Any attempt to add these two together will result in a `TypeError`.
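
As a rough illustration, here is a sketch in current PHP that stands in for
the proposed syntax with an ordinary method. The method name `add()`, the
helper `tryAdd()`, and the class bodies are assumptions made up for this
example, and `OperandPosition` is left out for brevity. The only point is
that the parameter types already do the filtering:

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins: add() plays the role of `operator +`.
final class Money
{
    public function __construct(public readonly int $cents) {}

    public function add(Money $other): Money
    {
        return new Money($this->cents + $other->cents);
    }
}

final class Vector2D
{
    public function __construct(
        public readonly float $x,
        public readonly float $y,
    ) {}

    public function add(Vector2D|array $other): Vector2D
    {
        // Assumption: an array operand is an [x, y] pair.
        [$x, $y] = is_array($other) ? $other : [$other->x, $other->y];
        return new Vector2D($this->x + $x, $this->y + $y);
    }
}

// Hypothetical helper: reports whether the "addition" was accepted.
function tryAdd(object $left, mixed $right): string
{
    try {
        $left->add($right);
        return 'ok';
    } catch (TypeError $e) {
        // Raised by the engine because the argument type does not match.
        return 'TypeError';
    }
}

echo tryAdd(new Money(500), new Money(250)), "\n";          // ok
echo tryAdd(new Money(500), new Vector2D(1.0, 2.0)), "\n";  // TypeError
echo tryAdd(new Vector2D(1.0, 2.0), new Money(500)), "\n";  // TypeError
```

Neither class needs to know the other exists for the mismatch to be
rejected, which is why a shared `Addable` interface would add very little.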

Classes which have overloads that look like the following would be
something I think developers should be IMMEDIATELY suspicious of:

`operator +(object $other, OperandPosition $position)`
`operator +(mixed $other, OperandPosition $position)`

Does your implementation really have a plan for how to `+` with a stream
resource like a file handle, as well as an int? Can you just as easily use
`+` with the `DateTime` class as you can with a `Money` class in your
implementation?

I think there are very few use cases that would survive code reviews or
feedback or testing that look like any of these signatures.

There are situations in which objects might accept objects from a different
class hierarchy. For instance, with the changes Saki has made there are now
objects for numbers in the BcMath extension. Those are objects that might
be quite widely accepted in overload implementations, since they represent
numbers in the same way that just an int or float might. But I highly doubt
that it's even possible for the overload to accept those sorts of things
without also being aware of them, and if the overload is aware of them it
can type-hint them in the signature.
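
Going back to the dispatch question from earlier in the thread, the
resolution order I listed can be sketched in current PHP roughly as follows.
This is only an illustration, not the engine implementation: the function
name `resolveHandler()` and the example classes are made up, the enum is a
stand-in for the proposed `OperandPosition`, any descendant (not only a
direct one) is treated as more specific, and the sketch does not check
whether the chosen class actually defines an overload:

```php
<?php
declare(strict_types=1);

// Hypothetical enum mirroring the OperandPosition discussed in the thread.
enum OperandPosition
{
    case Left;
    case Right;
}

/**
 * Returns the position of the operand whose overload would be called,
 * or null when neither operand is an object (overloads are bypassed).
 */
function resolveHandler(mixed $left, mixed $right): ?OperandPosition
{
    $leftIsObject = is_object($left);
    $rightIsObject = is_object($right);

    // Step 1: if only one operand is an object, that operand handles it.
    if (!$leftIsObject && !$rightIsObject) {
        return null;
    }
    if (!$rightIsObject) {
        return OperandPosition::Left;
    }
    if (!$leftIsObject) {
        return OperandPosition::Right;
    }

    // Step 2: same class on both sides, so the left operand wins.
    if ($left::class === $right::class) {
        return OperandPosition::Left;
    }

    // Step 3: a descendant on the right wins over its ancestor on the left.
    if ($right instanceof $left) {
        return OperandPosition::Right;
    }

    // Step 4: unrelated classes, or the descendant is already on the left.
    // A TypeError from that overload is surfaced; there is no retry.
    return OperandPosition::Left;
}

class Foo {}
class Bar extends Foo {}
class Baz {}

var_dump(resolveHandler(new Foo(), new Bar())); // enum(OperandPosition::Right)
var_dump(resolveHandler(new Bar(), new Foo())); // enum(OperandPosition::Left)
var_dump(resolveHandler(new Foo(), new Baz())); // enum(OperandPosition::Left)
var_dump(resolveHandler(5, new Foo()));         // enum(OperandPosition::Right)
```

In both `Foo + Bar` and `Bar + Foo` the handler comes from the `Bar`
instance, which is the behavior I argued for above.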

Jordan
