Re: [PHP-DEV] [RFC] Property accessor hooks, take 2

Rowan Tommins [IMSoP] Mon, 04 Mar 2024 08:48:32 -0800

On 27/02/2024 23:17, Larry Garfield wrote:
> 
> A little history here to help clarify how we ended up where we are: The 
> original RFC as we designed it modeled very closely on Swift, with 4 hooks.  
> Using get/set at all would create a virtual property and you were on your 
> own, while the beforeSet/afterSet hooks would not.



This is interesting to hear, because the current RFC comes across - or at least 
did to me, on first reading - as "these are normal properties, and then you're 
adding some magic on top of them".

The Eureka! moment for me was realising that the whole thing makes more sense 
if you *start* with "virtual" properties, and then add some magic to avoid 
declaring a second property to store a backing value.

This "virtual-first" view is how my current thinking is framed, which I tried 
to demonstrate here: https://wiki.php.net/rfc/property-hooks/imsop-suggestion



> == Re asymmetric typing:
> 
> This is capability already present today if using a setter method.  
> 
> class Person {
>     private $name;
> 
>     public function setName(UnicodeString|string $name)
>     {
>         $this->name = $name instanceof UnicodeString ? $name : new UnicodeString($name);
>     }
> }


I find this unconvincing. Just because this method name starts with "set", 
doesn't mean everything it does should be possible in a property setter.

You could also have a method that took two arguments:

public function setName(string $first, string $last)
{
    $this->name = $first . ' ' . $last;
}

Or a mandatory argument and an optional flag:

public function setName(string $name, bool $normalise=false)
{
    $this->name = $normalise ? ucwords($name) : $name;
}

Neither of those is going to be translatable to property set hooks, and that's 
totally fine.



> covering an easy-to-cover use case seems like a good thing to do.  


This is the crux: I don't think asymmetric types *are* easy. 

Firstly, they mean that every single piece of static analysis or reflection 
which wants to ask "what is the type of this property" has to take into account 
the new concept that the "settable type" might be different from the "gettable 
type".

Secondly, they mean that a user has to understand that this code might result 
in the variable magically changing type:

$me = 'Rowan';
$object->name = $me;
$me = $object->name;
// $me is now magically an object!
// how do I get my string back?

Thirdly, there's a simpler alternative, providing a separate virtual property, 
which can then be readable as well as writeable:

$me = 'Rowan';
$object->nameString = $me;
$me = $object->nameUnicode;
// easily visible that we're reading a different property, with a different type
$object->nameUnicode = $me;
$me = $object->nameString; 
// fully reversible, and no need to know how the object implements it
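
To make the comparison concrete, here is a rough sketch of the two-property 
version. It uses __get/__set so that it runs on current PHP, since the hook 
syntax is still under discussion. The Person class and the tiny UnicodeString 
stand-in are both made up for illustration; neither comes from the RFC.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for whatever Unicode string class is in use.
final class UnicodeString
{
    public function __construct(private string $bytes) {}

    public function toString(): string
    {
        return $this->bytes;
    }
}

// Hypothetical class: two virtual properties over one private backing value,
// emulated with magic methods rather than real hooks.
final class Person
{
    private UnicodeString $name;

    public function __construct()
    {
        $this->name = new UnicodeString('');
    }

    public function __get(string $prop): mixed
    {
        return match ($prop) {
            'nameString'  => $this->name->toString(),
            'nameUnicode' => $this->name,
            default       => throw new Error("Undefined property: $prop"),
        };
    }

    public function __set(string $prop, mixed $value): void
    {
        // Each virtual property accepts exactly the type it returns.
        if ($prop === 'nameString' && is_string($value)) {
            $this->name = new UnicodeString($value);
        } elseif ($prop === 'nameUnicode' && $value instanceof UnicodeString) {
            $this->name = $value;
        } else {
            throw new TypeError("Cannot assign this value to $prop");
        }
    }
}

$object = new Person();
$object->nameString = 'Rowan';        // write a string
$unicode = $object->nameUnicode;      // read it back as an object
$object->nameUnicode = $unicode;      // write the object
$roundTrip = $object->nameString;     // read it back as a string
```

Each property has a single type for both reading and writing, so a tool asking 
"what is the type of this property" gets one answer per property.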


> It also ties into the question of the explicit/implicit name, for the reason 
> you mentioned earlier (unspecified means mixed)

Another reason to dislike it as complicating the proposal, IMHO.


> == Re virtual properties:
> 
> On the downside, if you have a could-be-virtual property but never actually 
> use the backing value, you have an extra backing value hanging around in 
> memory that is inaccessible normally, but will still show up in some 
> serialization formats, which could be unexpected.

This is a reasonable argument in favour of automatic detection.


> If you omit one of the hooks and forget to mark it virtual, you'll still get 
> the default of the other operation

My immediate question here is "why?" Why does the set hook magically come into 
existence, just because you defined the get hook a particular way?

Consider this example, using a virtual property:

private int $_nextId;
public int $nextId { 
    get { $this->_nextId ??= 0; return $this->_nextId++; }
}

This might not be the most common thing to do, but (as far as I know) it will 
work. There is no direct write access to the property, but we need somewhere to 
store the incrementing value.

Then we say "woof, this is complicated, having to make my own backing property 
for all these little things; can this be simplified?" and write this:

public int $nextId { 
    get { $this->nextId ??= 0; return $this->nextId++; }
}

Great! We've got rid of the explicit backing field, and the code still works... 
but wait! Suddenly, a setter has appeared, even though we never asked for one!

I suppose the reasoning is that it's quite common to want to implement the 
default version of one or other hook; but note that with the currently proposed 
short-hand the defaults can be written as:

get => $this->someName;
set($value) => $value;

If that's still too long, we can borrow from C#, where they can be written as:

get;
set;

That gives the choice of whether the default is implemented back to the user, 
and removes a foot-gun inconsistency between virtual and backed properties.


> * Doing autodetection as now, but with an added "make a backing value anyway" 
> flag would resolve the use case of "My set hook just calls a method, and that 
> method sets the property, but since the hook doesn't mention the property it 
> doesn't get created" problem.

Unless I'm missing something, no it wouldn't. There's still no way for a method 
to refer to that backing value. If the method sets the *property*, it will just 
end up recursively calling the set hook. The backing value remains visible only 
inside the hooks.

Unless I've misunderstood, and the implementation somehow chooses the meaning 
of $this->foo based on whether the set hook is somewhere in the call stack, in 
which case ... yikes!


> == Re reference-get
> 
> Allowing backed properties to have a reference return creates a situation 
> where any writes would then bypass the set hook, and thus any validation 
> implemented there. 

As Stephen Reay says, isn't that up to the user to decide?

Again, the backed property isn't doing anything that a virtual property plus an 
explicit backing property couldn't:

private string $_foo;
public string $foo {
    &get { return $this->_foo; }
    set { /* whatever */ }
}

Either we're willing to trust the user with that power, and should let them do 
the same thing with a magic backing field; or we're not willing to trust them, 
and should not allow &get at all.



> There is one edge case that *might* make sense: If there is no set hook 
> defined, then there's no set hook to worry about bypassing.  So it may be 
> safe to allow &get on backed properties IFF there is no set hook.

Given my above argument that the set hook should never be added automatically / 
implicitly, this could be simplified to "allow an &get hook only if no other 
hook is defined" (i.e. you can't have both "get" and "&get", and you can't have 
both "&get" and "set").


> == Re arrays
> 
>> The simplest approach would be to copy the array, modify it accordingly, and 
>> pass it to set hook.

This isn't what I was suggesting.

I was suggesting that modifying the array called the &get hook, and modified 
whatever array that returned.


> Unless we were OK with that bypassing the set hook entirely if defined, 
> which, as noted above, means any safety guarantees provided by a set hook are 
> bypassed, leading to untrustworthy code.

Again, already possible as soon as you have any &get hooks at all:

$temp =& $foo->bar;
$temp[42] = true;
var_dump($foo->bar);
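
The same bypass can be reproduced on current PHP with a by-reference __get, 
which is the closest existing analogue of an &get hook. This is only a sketch: 
the Foo class and its "integers only" rule are invented for the example, and 
real hooks may differ in detail.

```php
<?php
declare(strict_types=1);

// Hypothetical class: "bar" is a virtual property emulated with __get/__set.
final class Foo
{
    private array $_bar = [];

    // Counterpart of an &get hook: hands out a reference to the backing array.
    public function &__get(string $name): mixed
    {
        if ($name !== 'bar') {
            throw new Error("Undefined property: $name");
        }
        return $this->_bar;
    }

    // Counterpart of a set hook: only arrays of integers are accepted.
    public function __set(string $name, mixed $value): void
    {
        if ($name !== 'bar' || !is_array($value)) {
            throw new TypeError("Cannot assign this value to $name");
        }
        foreach ($value as $item) {
            if (!is_int($item)) {
                throw new InvalidArgumentException('bar may only contain integers');
            }
        }
        $this->_bar = $value;
    }
}

$foo = new Foo();
$foo->bar = [1, 2, 3];            // validated by __set

$rejected = false;
try {
    $foo->bar = [1, 'two'];       // rejected by __set
} catch (InvalidArgumentException $e) {
    $rejected = true;
}

$temp =& $foo->bar;               // obtained through &__get
$temp[42] = true;                 // written straight into the backing array
$sneakedIn = $foo->bar[42];       // the boolean is there; __set never ran
```

So the validation in the setter only holds for as long as nobody takes a 
reference, which is true of any design that offers a reference getter at all.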



> == Re hook shorthands and return values
> 
> Ilija and I have been discussing this for a bit, and we've both budged a 
> little. :-)  Here's our counter-proposal:
> 
> - Drop the "top level" shorthand, for get-only hooks.
> - Keep the => shorthand for the get hook itself.
> - For a set hook, the {} form has no return value; set the value yourself 
> however you want.
> - For a set hook, the => form implies a backed value and will set the 
> property to whatever value that evaluates to.

I'm 100% behind this.



> I genuinely don't understand the pushback on $value.  It's something you 
> learn once and never have to think about again.  It's consistent.

For me, the problem is in having *both* a special name *and* the ability to 
choose the name. There's nowhere else in the language where that happens.

I also can't think of any reason someone would choose a name *other than* 
$value. I can well imagine coding standards mandating that it always be called 
that, making it boring boilerplate.


> Ilija jokingly suggested making it always $value, unconditionally, and 
> allowing only the type to be specified if widening:
> 
> public int $foo { set(int|float) => floor($value); }

Honestly, if you think asymmetric types are a good idea (which I don't), that 
makes a lot more sense. 

Specifying the writeable type has nothing to do with the name of the value, 
it's a special case that will be rarely used, and should draw attention to its 
key feature: the type.


> The alternative that gives the most future-flexibility is to do neither: The 
> variable is called $value, period, you can't change it, and you can't change 
> the type, either.  There is no () after set, ever.  Punt both of those to a 
> later follow-up.  I'd prefer to include both now, but including neither now 
> is the next-safer option.

This is by far my preferred option. Asymmetric types are too much magic, and 
choosing the variable name is just one more case to consider.



> ## Regarding $field
> 
> Sigh, now y'all like it. :-P

As I said at the top, the Eureka! moment for me was thinking "virtual first".

In the original RFC, it was implied (or, it seemed to me) that $field was just 
an alias for referencing the "real" property. That's a really tempting 
interpretation, but it's not what's happening.

What's really happening is that the property itself is virtual: every single 
access to it goes through the hooks. But, within the hooks, we have provided a 
magic variable, stored on the object but accessible only there, where the hooks 
can store a value of the same type as the virtual property.

Once I came to that interpretation, it became much more intuitive to call that 
magic variable by a magic name like $field than to re-use the syntax that 
would normally refer to the property, and make it sometimes reference this new 
thing instead.
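
As a rough model of that reading, again emulated with __get/__set on current 
PHP: the public property does not exist as storage at all, and a separate slot 
plays the part of $field. The Temperature class and its range check are 
invented for the example, and a private property can only approximate "visible 
only inside the hooks" by convention.

```php
<?php
declare(strict_types=1);

// Hypothetical class illustrating the "virtual first" reading.
final class Temperature
{
    // Plays the role of $field. Only the two "hooks" below should touch it;
    // with a plain private property that is a convention, not a guarantee.
    private float $field = 0.0;

    // "get hook": every read of $t->celsius lands here.
    public function __get(string $name): mixed
    {
        if ($name !== 'celsius') {
            throw new Error("Undefined property: $name");
        }
        return $this->field;
    }

    // "set hook": every write to $t->celsius lands here.
    public function __set(string $name, mixed $value): void
    {
        if ($name !== 'celsius') {
            throw new Error("Undefined property: $name");
        }
        if (!is_int($value) && !is_float($value)) {
            throw new TypeError('celsius must be a number');
        }
        if ($value < -273.15) {
            throw new RangeException('celsius is below absolute zero');
        }
        $this->field = (float) $value;
    }
}

$t = new Temperature();
$t->celsius = 21.5;
$reading = $t->celsius;

$outOfRange = false;
try {
    $t->celsius = -300;
} catch (RangeException $e) {
    $outOfRange = true;
}
```

Seen this way, $this->celsius inside the class would naturally mean "go 
through the hooks", and the storage wants a name of its own.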

To re-iterate an earlier point, though, I think the language should choose. 
There should be exactly one way to refer to the backing field, whether that's 
$this->foo, $field, or get_backing_field(). Don't leave users reading each 
other's code and not being sure if it's doing the same thing.


Regards,
-- 
Rowan Tommins
[IMSoP]
