Re: [PHP-DEV] [Pre-RFC Discussion] User Defined Operator Overloads (again)

Rob Landers Tue, 17 Sep 2024 13:38:45 -0700


On Tue, Sep 17, 2024, at 21:25, Rowan Tommins [IMSoP] wrote:
> On 17/09/2024 18:15, Jordan LeDoux wrote:
>> 
>>> 1. Are we over-riding *operators* or *operations*? That is, is the user 
>>> saying "this is what happens when you put a + symbol between two Foo 
>>> objects", or "this is what happens when you add two Foo objects together"?
>> 
>> If we allow developers to define arbitrary code which is executed as a 
>> result of an operator, we will always end up allowing the first one.
> 
> 
> I don't think that's really true. Take the behaviour of comparisons in your 
> previous RFC: if that RFC had been accepted, the user would have had no way 
> to make $a < $b and $a > $b have different behaviour, because the same 
> overload would be called, with the same parameters, in both cases.
> 
> Slightly less strict is requiring groups of operators: the Haskell "Num" 
> typeclass (roughly similar to an interface) requires definitions for all of 
> "+", "*", "abs", "signum", "fromInteger", and either unary or binary "-". It 
> also defines the type signatures for each. If this was the only way to 
> overload the "+" operator, users would have to really go out of their way to 
> use it to mean something unrelated to addition.
> 
> As it happens, Haskell *does* allow arbitrary operator overloads, and in fact 
> goes to the other extreme and allows entirely new operators to be invented. 
> The same is true in PostgreSQL - you can implement the <<//-^+^-//>>  
> operator if you want to.
> 
> I think it's absolutely possible - and desirable - to choose a philosophical 
> position on that spectrum, and use it to drive design decisions. The choice 
> of "__add" vs "operator+" is one such decision.
> 
> 
> 
>>  
>> The approach I plan to use for this question has a name: Polymorphic Handler 
>> Resolution. The overload that is executed will be decided by the following 
>> series of decisions:
>> 
>> 1. Are both of the operands objects? If not, use the overload on the one 
>> that is. (NOTE: if neither are objects, the new code will be bypassed 
>> entirely, so I do not need to handle this case)
>> 2. If they are both objects, are they both instances of the same class? If 
>> they are, use the overload of the one on the left.
>> 3. If they are not objects of the same class, is one of them a direct 
>> descendant of the other? If so, use the overload of the descendant.
>> 4. If neither of them are direct descendants of the other, use the overload 
>> of the object on the left. Does it produce a type error because it does not 
>> accept objects of the type in the other position? Return the error and abort 
>> instead of re-trying by using the overload on the right.
> 
> 
> This is option (g) in my list, with the additional "prefer sub-classes" rule 
> (step 3), which I agree would be a good addition.
> 
> As noted, it doesn't provide symmetry, because step 4 depends on the order in 
> the source code. Option (c) is the same algorithm without step 4, so 
> guarantees that $a + $b and $b + $a will always call the same method.
> 
> Options (d), (e), and (f) each add an extra step: one operand can signal "I 
> don't know" and the other operand gets a chance to answer. They're 
> essentially ways to "partially implement" an operator.
> 
> Options (a) and (b) perform the same kind of polymorphic resolution on *both* 
> operands, which is how many languages work for functions and/or methods 
> already. 
> 
> 
> 
> Reading the C# spec, if there is more than one candidate overload which is 
> equally specific, an error is raised. I guess you could do the same even with 
> one implementation per class, by replacing step 4 in your algorithm:
> 
> 
> > 4. If neither of them are direct descendants of the other, and only one 
> > implements the operator, use it.
> > 5. If neither of them are direct descendants of the other, and both 
> > implement the operator, throw an error.
> 
> Let's call that option (h) :)
> 
> 
> 
> By the way, searching online for the phrase "Polymorphic Handler Resolution" 
> finds no results other than you saying it is the name for this algorithm.
> 
> 
> 
>> This is similar to what I originally designed, and I actually moved to an 
>> enum based on feedback. The argument was something like `$isReversed` or 
>> `$left` or so on is somewhat ambiguous, while the enum makes it extremely 
>> explicit.
> 
> 
> Ah, fair enough. Explicitness vs conciseness is always a trade-off. My 
> thinking was that the "reversed" form would be far more rarely called than 
> the "normal" form; but that depends a lot on which resolution algorithm is 
> used.
> 
> 
> 
> Regards,
> 
> -- 
> Rowan Tommins
> [IMSoP]


To be honest, this juggling of call order has me a bit concerned. Matrix 
multiplication isn’t commutative, and neither are non-abelian groups in 
general (quaternions being another popular example). I am also used to Scala, 
where the left-hand operand is always the one called.

I understand that this is what the operand position is for, but extreme care 
is needed when working with these types of objects once another object is 
involved. For example, a quaternion can be multiplied by a matrix, and the 
order is super important (it is used for 3D rotations). Yet which method 
actually gets called may not be predictable: because these classes may be 
unrelated, the one on the left is called, which may or may not produce a 
correct answer, depending on how the libraries are implemented and whether 
they were designed to work together. All that is just to illustrate how 
complex this ordering algorithm seems to be.
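
To make that concrete, here is a rough sketch of the quoted four-step 
resolution. Python is used purely for illustration, and everything in it is an 
assumption of mine rather than anything from the proposal: the name 
pick_handler, the example classes, and the choice to treat any subclass as a 
"descendant" (the quoted text says "direct descendant", which may be narrower).

```python
SCALARS = (int, float, str, bool, type(None))


def is_object(value):
    """Stand-in for an object check: anything that is not a scalar here."""
    return not isinstance(value, SCALARS)


def pick_handler(left, right):
    """Return the operand whose overload would run, or None if neither is an object."""
    # Step 1: if only one operand is an object, its overload is used.
    if not is_object(left) and not is_object(right):
        return None  # the new code is bypassed entirely
    if not is_object(right):
        return left
    if not is_object(left):
        return right
    # Step 2: same class, so the left operand wins.
    if type(left) is type(right):
        return left
    # Step 3: one descends from the other, so the descendant wins.
    if isinstance(right, type(left)):
        return right
    if isinstance(left, type(right)):
        return left
    # Step 4: unrelated classes, so the left operand wins, with no retry on the right.
    return left


# Hypothetical classes, possibly from two unrelated libraries.
class Matrix:
    pass


class RotationMatrix(Matrix):
    pass


class Quaternion:
    pass


q, m, r = Quaternion(), Matrix(), RotationMatrix()
```

With these definitions, pick_handler(q, m) returns q and pick_handler(m, q) 
returns m, so swapping the operands swaps which library's code runs, while 
pick_handler(m, r) and pick_handler(r, m) both return r.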

I would prefer to see something simple and easy to reason about. We can abuse 
some mathematical properties to get something quite simple:

 1. If both are scalar, use existing logic. 
 2. If one is scalar and the other does not override the operation, use 
existing logic. 
 3. If one is scalar and the other overrides the operation, rearrange the 
operation per its commutative rules so the object is on the left: $scalar + 
$obj == $obj + $scalar; $scalar - $obj == -$obj + $scalar == -($obj - $scalar). 
It is generally accepted (IIRC) that when scalars are involved, we don’t need 
to be concerned with non-abelian groups.
 4. If both are objects, use the one on the left.

I think this is much easier to reason about (you either get a scalar or another 
object), and it doesn’t require a developer to deeply understand the inheritance 
of the objects in question or the algorithm for choosing which one will be 
called. 
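
A minimal sketch of those four rules, again in Python purely for illustration. 
The Meters class, the method names add/sub/neg, and the evaluate function are 
all hypothetical, and raising TypeError is only a stand-in for whatever the 
"existing logic" does when no overload is present.

```python
import operator

SCALARS = (int, float)
BUILTIN = {"+": operator.add, "-": operator.sub}
HANDLER = {"+": "add", "-": "sub"}  # hypothetical overload method names


class Meters:
    """Hypothetical value object that overloads +, - and unary minus."""

    def __init__(self, value):
        self.value = value

    def _raw(self, other):
        # Accept either another Meters or a plain scalar.
        return other.value if isinstance(other, Meters) else other

    def add(self, other):
        return Meters(self.value + self._raw(other))

    def sub(self, other):
        return Meters(self.value - self._raw(other))

    def neg(self):
        return Meters(-self.value)


def evaluate(op, left, right):
    """Evaluate `left op right` for "+" and "-" using rules 1 to 4."""
    left_scalar = isinstance(left, SCALARS)
    right_scalar = isinstance(right, SCALARS)

    # Rule 1: two scalars, nothing changes.
    if left_scalar and right_scalar:
        return BUILTIN[op](left, right)

    # Rule 4 (and the object-on-the-left case): the left operand decides.
    if not left_scalar:
        handler = getattr(left, HANDLER[op], None)
        if handler is None:
            raise TypeError("left operand does not overload " + op)
        return handler(right)

    # From here on: scalar on the left, object on the right.
    handler = getattr(right, HANDLER[op], None)
    if handler is None:
        # Rule 2: no overload, so defer to existing behaviour (stand-in).
        raise TypeError("unsupported operand types")

    # Rule 3: rearrange so the object ends up on the left.
    if op == "+":
        return handler(left)        # s + o == o + s
    return handler(left).neg()      # s - o == -(o - s)
```

For example, evaluate("-", 10, Meters(3)) calls Meters(3).sub(10) and negates 
the result, so the object's own code runs even though the scalar was written 
on the left.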

— Rob
