Re: [PHP-DEV] [Pre-RFC Discussion] User Defined Operator Overloads (again)

Jordan LeDoux Tue, 17 Sep 2024 10:27:26 -0700

On Tue, Sep 17, 2024 at 2:14 AM Mike Schinkel <m...@newclarity.net> wrote:


> > On Sep 17, 2024, at 1:37 AM, Jordan LeDoux <jordan.led...@gmail.com>
> wrote:
> > On Mon, Sep 16, 2024 at 9:35 PM Mike Schinkel <m...@newclarity.net>
> wrote:
> >
> > Yes, if constraints of the nature I propose below are adopted.
> >
> > The biggest problem I have with operator overloads is that — once added
> — all code could potentially be "infected" with operator overloads.
> However, if the developer *using* an operator overload could instead opt-in
> to using them, in context, then I would flip my opinion and I would begin
> to support them.
> >
> > What might opt-in look like?  I propose two (2) mechanisms of which each
> would be useful for different use-cases. As such I do not see these two as
> competing but instead would expect adding both to be preferable:
> >
> > 1. Add a pair of sigils to enclose any expression that would need to
> support userland operator overloading. This would allow a developer to
> isolate just the expression that needs to use operator overloading. I
> propose {[...]} for this, but feel free to bikeshed sigils. Using an
> example from the RFC, here is what code might look like:
> >
> > $cnum1 = new ComplexNumber(1, 2);
> > $cnum2 = new ComplexNumber(3, 4);
> > $cnum3 = {[ $cnum1 * $cnum2 ]};               // Uses operator overloading sigils
> > echo $cnum3->realPart.' + '.$cnum3->imaginaryPart.'i';
> >
> > 2. For when using `{[...]}` would be annoying because it would be needed
> in so many places, PHP could also add support for an attribute. e.g.
> `#[OperatorOverloads(Userland:true)]`. This attribute would apply to
> functions, methods, classes, enums, (other?) and indicates that operator
> overloads can be present anywhere in the body of the decorated structure. I
> included `Userland:true` as an indicator to a reader that this only applies
> to userland operator overloads and that built-in ones like in GMP and
> anywhere else would not need to be opted into, but that parameter could of
> course be dropped if others feel it is not needed. Again, feel free to
> bikeshed attribute name and/or parameters.
> >
> > #[OperatorOverloads(Userland:true)]
> > function SprintProductOfTwoComplex(ComplexNumber $cnum1, ComplexNumber $cnum2): string {
> >   $cnum3 = $cnum1 * $cnum2;
> >   return sprintf("%d + %di", $cnum3->realPart, $cnum3->imaginaryPart);
> > }
> >
> > If this approach were included in the RFC then it would also ensure
> there is no possibility of BC breakage. Such breakage would certainly
> be an edge case, but I can envision it being possible, especially where
> newer instances incorporating operator overloads are passed to functions
> that did not have type-hinted parameters and were not intended to be used
> with operator overloads, resulting in subtle potential breakage.
> >
> > This argument is also consistent with the argument people had about not
> allowing default values to be generically used in calls to the
> function. Their claim was that developers who did not write their code with
> the intention of exposing defaults should not have their defaults exposed.
> Similarly, code that developers did not write to enable operator
> overloads should not be used with userland operator overloads unless they
> explicitly allow it, especially as they may not have tested their code with
> operator overloads.
> >
> > Anyway, that is my two cents worth.
> >
> > TL;DR?  I argue that PHP should add operator overloads but ONLY if there is
> a mechanism that requires the user of expressions that call overloaded
> operators to explicitly opt-in to their use.
> >
> > -Mike
> >
> >
> > This is interesting, as I've never seen this in any language I
> researched as part of operator overloading, and also was never given this
> feedback or anything similar by anyone who provided feedback before.
>
> If all language features required prior art, there would never be
> innovation in programming languages. So for anything that currently exists,
> there was always a first language that implemented it.
>
> Of course when there is prior art we can use the heuristic of "All these
> have done it before so it must be a good idea."  But lack of prior art
> should not be the reason to dismiss something, it should be evaluated on
> its merits.
>
> > My initial reaction is that I do not understand how this is any better
> than parameter typing. If you do not allow any objects into the scope you
> are using operators, wouldn't that be the same as the kind of userland
> control you are after? Or rather, how would it be substantially worse?
>
> How would a developer know if they are using an object that has operators,
> unless they study all the source code or at least the docs (assuming there
> are good docs, which there probably are not?)
>
> It might be illustrative to explicitly call out different scenarios I
> envision in case some are not obvious.
>
> There are:
>
> 1. Internal projects that are almost entirely bespoke code, with an active
> team where the code is run by the code owners. Think a big company's
> internal operations.
>
> 2. Agencies that build web projects using frameworks and libraries for
> clients.
>
> 3. Smaller companies using frameworks and libraries for internal use, with
> a small team that may have many other duties, or those who outsource to
> contractors when they need things, and breakage for them can be very
> painful.
>
> 4. Framework developers
>
> 5. Library developers
>
> 6. And probably a bunch of other scenarios, each slightly different.
>
> Each of those scenarios has a different level of knowledge about the code
> they work on. I'd expect #2 & #3 to have the least knowledge of the code
> they use and would be most affected by other people's code doing things
> they do not expect.
>
> I'd argue that #1 would have better knowledge of their code and would be
> less affected by other people's code, except they probably have a huge
> amount of bespoke code so one developer likely does not know what another
> developer is doing, and especially if they have teams that develop tools
> for other teams to use.
>
> Lastly #4 and #5 likely know their codebases the best, but they may create
> footguns for developers in category #2 and #3 if the language allows them
> to. And vice-versa.
>
> So back to your question "If you do not allow any objects into the scope
> you are using operators wouldn't that be the same as the kind of userland
> control you are after?" So I ask — How do I know if the objects I am using
> that were developed by others use operators or not? With free-rein
> userland operator overloads we would be required to dig into the source for
> the code written by others that we use to ensure we know if they have
> operators and how they work.
>
> OTOH with my suggestion, we will know because the code will crash when no
> opt-in is used.
>
> Note, I refer to cases where code that calls code evolves, uses dynamic
> programming, and/or accepts mixed types. And I am especially talking about
> when developers create classes to wrap a built-in type and then implement
> operators, but add special cases to them such as a String() class that
> implements the concatenation operator but with a twist.
>
> > Your second example even includes a function that only accepts a
> `ComplexNumber` object. I presume in your example there that if the
> Attribute was removed, the function would just always produce a fatal
> error, since that is the behavior of objects when used with `*`.
>
> Yes, that was the intention for the attribute, or lack of attribute in the
> case you describe.
>
> >
> > What it appears to me your proposal does is transform working operator
> overloads into fatal errors if the user-code does not "opt-in".
>
> Correct.
>
> > But any such code would never actually survive long, wouldn't it?
>
> That is the feature, not a bug.
>
> > Without the opt-in, these objects would ALWAYS produce fatal errors
> (which is what happens now),
>
> Well, we do not have operator overloads right now. With operator overloads
> they could run without crashing but have subtle bugs.
>
> Note I am not referring to highly specific functions written for highly
> specific classes which is what I suspect you are envisioning. Based on your
> past comments those seem to be the areas you operate in, i.e. math-related.
>
> I am instead referring to code that is written to be generic but that ends
> up running code it did not intend to run because of edge cases that are
> exposed by userland operators.
>
> > which would eventually show up in testing, QA, etc.
>
> Eventually.  Assuming they have a good testing and QA process which many
> PHP projects do not. PHP is a least-common denominator language because it
> is one of the easiest to get started with. Many less experienced PHP
> developers do not have good testing and QA processes.
>
> But even if they do have good testing and QA, the sooner the bugs appear
> the less likely they will get deployed.
>
> > The developer would realize that they (presumably) were trying to do a
> math operation on something they thought was only a numeric type, and then
> guard against objects being passed into that context with control
> statements, parameter types, etc.
>
> Exactly. In my proposed concept they would rework their expressions to
> opt-in to using the overloaded operators once they ensure that they
> understand how the code operates.
>
> > So it seems to me what this ACTUALLY guards against is developers who
> inadvertently don't type-check their variables in code where the specific
> type is relevant.
>
> OR do not fully know the details of the types they are using.
>
> OR they are using types that have been upgraded to now support operator
> overloading, but they do not realize that.
>
> > After one round of testing, all of the code using operators would either
> always allow objects and thus overloads, or never allow objects and thus
> not use overloads.
>
> That assumes they crash. I am concerned for when they do not crash but
> instead have subtle bugs.
>
> > There shouldn't even be any existing code that would be affected, since
> any existing code would need to currently allow objects in a context where
> operators are used, which currently produces a fatal error 100% of the
> time, (excepting internal classes which are mostly final anyway, and thus
> unaffected by this proposal).
>
> It is correct that no old code can call other old code and use operators
> on objects.
>
> But *new* code could call old code and then that old code could be made to
> run operators without ever intending to be run in that manner.
>
> > What is the situation where your suggestion is implemented, a developer
> does NOT opt-in to overloads, and they avoid unexpected behavior without
> having to change their existing code to fix fatal errors? I don't see how
> that is possible.
>
> In your hypothetical it appears you referred to only one developer. But
> where I see issues is when there are two or more developers; a producer of
> functions and a consumer of functions.
>
> Situation where there is free-rein userland operator overloading:  Junior
> developer Joe is using Symfony and learns about this great new operator
> overload feature so decides to implement all the operators for all his
> objects, and now he wants to start passing his objects to Symfony code.
> Joe decides to be clever and implement "/" to concatenate path strings
> together but doesn't type his properties, and he ends up passing them to a
> Symfony function that uses `/` for division, and his program crashes with
> very cryptic error messages.  He reports them to the Symfony developers,
> and it wastes a bunch of time for everyone until they finally figure out
> why it failed, because nobody ever considered a developer would do such a
> thing.
>
> Same scenario but with required opt-in. Joe does the same thing but this
> time he gets a very clear message that says "Symfony Widget does not
> support operator overloads."  He googles and quickly finds out what
> that means and then goes to ask the Symfony team to support operator
> overloads. They can choose to either add support, or not, but it is up to
> them if they want to open the can of worms related to support that operator
> overloading might cause.
>
> > Also, replying into a 3 year old reddit thread I linked to for reference
> is not what I intended, however I want to highlight one other thing you
> commented there but not here for some reason:
> >
> > > To illustrate my point, imagine if we also allowed control structure
> overloads. If we had them we could no longer read code and know that an
> `if` is a branch and a `for` is a loop; either could be anything valid for
> any control structure. Talk about ambiguity!
> >
> > Indeed. I want to make sure that I have not been ambiguous after reading
> this, because I found it somewhat troubling:
> >
> > I am looking at writing an RFC for specific *operators* that are finite
> and defined within the RFC. I am not proposing something that would allow
> control structures to be altered (I don't even think that would be possible
> without essentially rewriting the entire Zend Engine specifically to do it).
> >
> > Operators are not control structures. Operators mutate the value or
> state of a variable in a repeatable way, given the input states. There is
> not even a generalized mechanism in my RFC for "arbitrary" overloads, and
> the compiler was not implemented in a way that is generalized for it
> either. It allows only exactly the operators that are part of the RFC, and
> each are handled specifically and individually.
>
> I was ONLY using control structures as a more extreme analogy to operator
> overloading to try to illustrate how — the more things you make
> configurable in a language — the more you allow the ground to shift beneath
> a developer's feet, so to speak.
>
> An approach I use when trying to understand something that might be subtle
> is to ask myself what a more extreme example is that would be analogous and
> then I consider that.
>
> So I was not saying you proposed that, I was equating control structure
> overloading to operator overloading, but I explicitly meant control
> structure overloading would be a more extreme opening up of PHP than
> operator overloading.
>
> Clearly control structure overloading would be bad. I was trying to make
> the point that operator overloading would cause problems for the same
> reason, even if the problems would not be as extreme.
>
> I am sorry that my wording did not make it clear that I was using an
> analogy, not referring to your RFC.
>
> Anyway, as a closing for this email, I know you badly want operator
> overloading but there were enough people who disliked the idea to vote
> against it last time so — assuming my proposal could satisfy them too — it
> seems like a great compromise to give you true operator overloading with
> just a little extra boilerplate while at the same time allowing developers
> to limit the scope of operator overloads to just those functions where they
> want to enable it.
>
> What's more, if after a few years we find out that my concerns really were
> for naught then a future RFC could open it up and remove the opt-in
> requirement.
>
> But one thing is certain: if we open up operator overloading completely
> on day one, we could never go back to opt-in.
>
> -Mike


While I do not presume to speak for all voters (I don't even have voting
rights myself), my feeling from all of the conversations I have had over
almost the last 4 years is that implementing your suggestion would
virtually guarantee that the RFC is declined. You are suggesting providing
a new syntax (which voters tend to be skeptical of) to create a situation
where more errors occur (which voters tend to be skeptical of) to solve a
problem which can be solved with existing syntax by simply type guarding
your code to not allow any objects near your operators (which voters tend
to be skeptical of) for which I cannot find any code examples that explain
the problem it is solving (which voters tend to be skeptical of).
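
A minimal sketch of the kind of type guarding meant here, assuming PHP 8.0+
union types. The function names are hypothetical and not taken from the RFC
or any library; this only illustrates the idea of keeping objects away from
an operator with syntax that already exists:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: scalar parameter types mean no object can ever
// reach the `*` operator; passing one raises a TypeError at the call site.
function scaleValue(int|float $value, int|float $factor): int|float
{
    return $value * $factor;
}

// Hypothetical helper for code that must accept mixed input: an explicit
// guard rejects anything that is not an int or float before `*` is used.
function scaleMixed(mixed $value, int|float $factor): int|float
{
    if (!is_int($value) && !is_float($value)) {
        throw new InvalidArgumentException(
            'scaleMixed() accepts only int or float values'
        );
    }

    return $value * $factor;
}
```

With either guard in place, an object carrying an overloaded `*` would be
rejected before the operator is evaluated, whether or not overloads exist.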

Jordan
