Re: [PHP-DEV] [RFC] Explicit call-site pass-by-reference (again)

2020-02-21 Thread Mike Schinkel
> On Feb 21, 2020, at 5:20 PM, Rowan Tommins wrote:
> 
> On 20 February 2020 14:13:58 GMT+00:00, Nikita Popov wrote:
>> Hi internals,
>> 
>> I'd like to start the discussion on the "explicit call-site
>> pass-by-reference" RFC again:
>> https://wiki.php.net/rfc/explicit_send_by_ref
> 
> My instinctive reaction is still one of frustration that the pain of removing 
> call-site ampersands was in vain, and I will now be asked to put most of them 
> back in.

That is a great example of what is known as a "sunk cost."

In summary: "A sunk cost is a cost paid in the past that is no longer 
relevant to decisions about the future."

> It's also relevant that users already find where & should and should not be 
> used very confusing.

One of the reasons it is confusing is that developers are currently required 
to use the ampersand in one place and not the other.  Making it always used 
removes that confusion, as there would no longer be a reason to remember 
when to use the ampersand and when not to.
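
A minimal sketch of that asymmetry under current PHP. The two helper
functions are hypothetical and exist only for illustration. The ampersand
appears in one declaration, while both call sites look identical:

```php
<?php
// Hypothetical helper: the ampersand appears only here, in the declaration.
function increment(int &$n): void
{
    $n++;
}

// Hypothetical by-value counterpart whose call looks exactly the same.
function incremented(int $n): int
{
    return $n + 1;
}

$a = 1;
$b = 1;

increment($a);    // modifies $a, yet nothing at the call site says so
incremented($b);  // leaves $b alone; the return value is simply discarded

var_dump($a, $b); // $a is now 2, $b is still 1
```

A call-site marker, as discussed in this thread, would let a reader tell
the two calls apart without looking up either declaration.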

> There is a potential "PR" cost of this change that should be weighed against 
> the advantages.

To say "We fixed something that, in hindsight, we have determined was a 
problem"? How is this a concern?

And when has the PHP community primarily worried about PR cost anyway, except 
when Hack started eating PHP's lunch in terms of performance?

> I'm also not very keen on internal functions being able to do things that 
> can't be replicated on userland, and this RFC adds two: additional behaviour 
> for existing "prefer-ref" arguments, and new "prefer-value" arguments.

I used to have the same preference.  And then I realized that languages that 
allow everything and do not withhold low-level functionality allow userland to 
create DSL-like extensions that can result in highly fragile and obtuse 
architectures.  Just look at Ruby.

And yes, that is an abstract argument, but so is a generic concern about adding 
internal functions that cannot be replicated in userland.

So what specific problems would having these enhancements cause for the language?

> My current opinion is that I'd rather wait for the details of out and inout 
> parameters to be worked out, and reap higher gains for the same cost. For 
> instance, if preg_match could mark $matches as "out", I'd be more happy to 
> run in a mode where I needed to add a call-site keyword.

This sounds like preferring the perfect in the (potentially distant) future 
over the much better today.

If this feature does not block some abstract vision for a perfect future and is 
something that can be delivered in the short term to solve real-world problems 
today, why stand in its way?

-Mike

Re: [PHP-DEV] [RFC] [DISCUSSION] Immutable/final/readonly properties

2020-02-21 Thread Andreas Hennings
When writing immutable classes, I want to be able to set properties in
static factories and in wither methods.

Once the new instance is sent to the outside world, its properties can be
locked to prevent further modification.

This sounds to me like we need different modes. Either the object itself
would have different states over time, or the object stays the same and
instead some methods have mutation permission on newly created objects.

This could be seen as a runtime state problem or as a compile time code
verification problem.
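
As a rough sketch of the pattern described above, here is how a static
factory and a wither method look in current PHP without any language-level
locking. The class `Point` and its members are hypothetical names chosen for
illustration. Immutability here rests only on the properties being private
and on the class never mutating `$this` after construction:

```php
<?php
// Hypothetical value object; nothing here is enforced by the engine.
final class Point
{
    private int $x;
    private int $y;

    // Private constructor: instances are only created via the factory.
    private function __construct()
    {
    }

    // Static factory: sets properties on a freshly created instance.
    public static function create(int $x, int $y): self
    {
        $point = new self();
        $point->x = $x;
        $point->y = $y;
        return $point;
    }

    // Wither: clones, changes one property on the clone, returns the clone.
    public function withX(int $x): self
    {
        $copy = clone $this;
        $copy->x = $x;
        return $copy;
    }

    public function x(): int
    {
        return $this->x;
    }

    public function y(): int
    {
        return $this->y;
    }
}

$a = Point::create(1, 2);
$b = $a->withX(10);

var_dump($a->x(), $b->x(), $b->y()); // the original keeps x = 1
```

Both `create()` and `withX()` write to an object that has not yet been
handed to the outside world, which is the window in which mutation would
need to stay permitted.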


Re: [PHP-DEV] [RFC] [DISCUSSION] Immutable/final/readonly properties

2020-02-21 Thread Larry Garfield
On Fri, Feb 21, 2020, at 4:29 AM, Máté Kocsis wrote:
> >
> > Yeah, I'm definitely thinking in relation to the earlier discussion, since
> > I think they're all inter-related.  (This, property accessors, and constant
> > expressions.)
> >
> 
> The biggest question is whether it's worth supporting both readonly
> properties and property accessors. My answer is a clear yes, because there
> are many ways to mess with private or protected properties without any public
> setters from the outside - in which case property accessors couldn't help
> much. I collected some examples I know of:
> https://3v4l.org/Ta4PM

I didn't even know you could do some of those.  That's horrifying. :-)

> > As Nikita notes above, a read-only property with a default value is...
> > basically a constant already.  So that's not really useful.
> 
> I agree that they are not very useful, however I wouldn't restrict their
> usage. Mainly because there are probably some legitimate use-cases, but I
> also think it would
> be advantageous to be consistent with the other languages in this case.

If they could do something that class constants can't, that would make them 
useful.  If not, then I feel like it would just be introducing new syntax for 
the same thing, without much benefit.  (I'm thinking of, eg, could you set them 
by default to a new Foo() object, which you could then modify the Foo but not 
change it for another object, thus moving that initialization out of the 
constructor?  That sort of thing.)

> > If we could address the performance impact, that would give us much more
> > functionality-for-the-buck, including an equivalent of read-only properties
> > including potentially lazy initialization.  Or derive-on-demand behavior
> > would also be a big increase in functionality.
> >
> > It's not that I don't see a value to this RFC; I actually have a few
> > places in my own code where I could use it.  It's that I see it as being of
> > fairly narrow use, so I'm trying to figure out how to increase it so that
> > the net-win is worth it.
> >
> 
> The reason why I brought up this RFC is that I'd really like to add
> first-class support for immutable objects, and it seemed to be a good idea
> to first go for readonly properties.
> This way, the scope of an immutable object RFC gets smaller, while it's
> possible to only have readonly properties alone.
> 
> Regards,
> Máté

I'm totally on board for better value object support, so that's a good motive 
for me.  The question I have is whether this is really a good stepping stone in 
that direction or if it would lead down a wrong path and lock us into too much 
TIMTOWTDI (for the Perl fans in the room).  So let's think that through down 
that path.  How would write-once properties lead into properly immutable value 
objects?  Or do they give us that themselves?

The biggest challenge for immutable objects, IMO, is evolving them.  Eg, 
$result->withContentType(...) to use the PSR-7 example.  Would we expect people 
to do it with a method like that, or would there be some other mechanism?  If 
the properties are public, would we offer a more syntactic way to modify them 
directly?

The with*() method style requires cloning the object.  What happens to the 
locked status of a set property if the object is cloned?  Are they then 
settable again, or do they come pre-locked?

Neither of those seem good, now that I think about it.  If they come 
pre-locked, then you really can't clone, change one property, and return the 
new one (as is the standard practice now in that case).  If they don't come 
pre-locked, then the newly created object can have everything on it changed, 
once, which creates a loophole.  I'm not sure what the right answer is here.
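
The dilemma can be made concrete with a userland emulation. This is only a
sketch under stated assumptions: the class `Response`, the `$locked` flag and
the `$unlockOnClone` switch are hypothetical and are not how the RFC
implements anything. They merely model the two possible clone behaviours:

```php
<?php
// Hypothetical emulation of a write-once property with a lock flag.
final class Response
{
    private string $contentType;
    private bool $locked = false;
    private bool $unlockOnClone;

    public function __construct(string $contentType, bool $unlockOnClone)
    {
        $this->contentType = $contentType;
        $this->unlockOnClone = $unlockOnClone;
        $this->locked = true; // set once, then locked
    }

    public function __clone()
    {
        if ($this->unlockOnClone) {
            $this->locked = false; // the clone may be written once more
        }
    }

    public function setContentType(string $type): void
    {
        if ($this->locked) {
            throw new LogicException('contentType is locked');
        }
        $this->contentType = $type;
        $this->locked = true;
    }

    public function withContentType(string $type): self
    {
        $copy = clone $this;
        $copy->setContentType($type);
        return $copy;
    }

    public function contentType(): string
    {
        return $this->contentType;
    }
}

// Variant 1: clones come pre-locked, so the wither cannot do its job.
$preLocked = new Response('text/plain', false);
try {
    $preLocked->withContentType('text/html');
    $witherFailed = false;
} catch (LogicException $e) {
    $witherFailed = true;
}

// Variant 2: clones are unlocked, so the wither works ...
$unlocked = new Response('text/plain', true);
$html = $unlocked->withContentType('text/html');

// ... but anyone holding a fresh clone gets one free write as well.
$loophole = clone $unlocked;
$loophole->setContentType('application/json');
```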

My other concern is a public property (the most likely use case) would have to 
be set in the constructor.  If it's not, then callers cannot rely on it having 
been set yet if it's set lazily.  And if code inside the class tries to set it 
lazily, it may already have been set by some external code (rightly or wrongly) 
and cause a failure.

How do we address that?  There's absolutely use cases where setting everything 
in the constructor ahead of time is what you'd do anyway, but there are plenty 
where you wouldn't want to, either, which creates a race condition for who sets 
it first, or tries to access it before it gets set, etc.  (This is where my 
repeated questions about lazy initialization come from.)

--Larry Garfield

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] [RFC] Explicit call-site pass-by-reference (again)

2020-02-21 Thread Rowan Tommins
On 20 February 2020 14:13:58 GMT+00:00, Nikita Popov wrote:
>Hi internals,
>
>I'd like to start the discussion on the "explicit call-site
>pass-by-reference" RFC again:
>https://wiki.php.net/rfc/explicit_send_by_ref


Hi Nikita,

Thanks for putting the case for this so clearly. My instinctive reaction is 
still one of frustration that the pain of removing call-site ampersands was in 
vain, and I will now be asked to put most of them back in. It's also relevant 
that users already find where & should and should not be used very confusing. 
There is a potential "PR" cost of this change that should be weighed against 
the advantages.

I'm also not very keen on internal functions being able to do things that can't 
be replicated on userland, and this RFC adds two: additional behaviour for 
existing "prefer-ref" arguments, and new "prefer-value" arguments.

My current opinion is that I'd rather wait for the details of out and inout 
parameters to be worked out, and reap higher gains for the same cost. For 
instance, if preg_match could mark $matches as "out", I'd be more happy to run 
in a mode where I needed to add a call-site keyword.
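
For context, a short sketch of how preg_match() behaves today. The pattern
and subject string are arbitrary examples. The third argument is taken by
reference and acts purely as an output: the variable does not need to exist
before the call.

```php
<?php
// $matches is deliberately not defined before the call.
$found = preg_match('/(\d+)-(\d+)/', 'range 10-20', $matches);

// preg_match() takes its third parameter by reference and fills it in.
var_dump($found);    // 1 when the pattern matched
var_dump($matches);  // full match followed by the two captured groups
```

Nothing at the call site distinguishes $matches from an ordinary input
argument, which is the kind of case an "out" marker would address.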

Regards,

-- 
Rowan Tommins
[IMSoP]


Re: [PHP-DEV] [RFC] [DISCUSSION] Immutable/final/readonly properties

2020-02-21 Thread Máté Kocsis
>
> Of course, that does leave the question of how often you need one or the
> other. Maybe just the asymmetric visibility is sufficient for most
> practical purposes, in which case it may not be worthwhile to introduce
> readonly properties as a separate feature.
>

The examples shown in my previous email are indeed not very practical, but
still, I would say that the added protection against possible misuse or
accidental modifications (coming from either inside or outside) would be
useful.

> Maybe it would make more sense to forbid readonly properties with default
> values?
>

As I mentioned in my response to Larry, my point of view is that default
values should be allowed. If there is strong opposition to this, I'm
open to changing it, though.


> Regarding the keyword choice, I think you can drop "sealed" from the list,
> as it is an established term that affects inheritance, not mutability. Of
> the choices you present, "immutable", "readonly" and "writeonce" seem like
> the most viable candidates.
>

Thank you for the suggestions! Sure, we can drop "sealed", and I'm ok to
add "immutable" and "readonly" to the list of voting choices. I'll also
extend the evaluations with your thoughts.

Regards,
Máté


Re: [PHP-DEV] [RFC] [DISCUSSION] Immutable/final/readonly properties

2020-02-21 Thread Máté Kocsis
>
> Yeah, I'm definitely thinking in relation to the earlier discussion, since
> I think they're all inter-related.  (This, property accessors, and constant
> expressions.)
>

The biggest question is whether it's worth supporting both readonly
properties and property accessors. My answer is a clear yes, because there
are many ways to mess with private or protected properties without any public
setters from the outside - in which case property accessors couldn't help
much. I collected some examples I know of:
https://3v4l.org/Ta4PM

Please note that the first two examples also apply to private properties,
while the last one only applies to protected ones.
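
The linked snippet is not reproduced here. As one illustration of the general
problem, which may or may not be among the linked examples, a closure bound
to an object's class scope can write to a private property from outside. The
class `Account` is a hypothetical name:

```php
<?php
// Hypothetical class with a private property and no setter.
final class Account
{
    private int $balance = 100;

    public function balance(): int
    {
        return $this->balance;
    }
}

$account = new Account();

// Closure::bind() rebinds $this and the class scope of the closure, so the
// body runs as if it were a method of Account.
$poke = Closure::bind(function (): void {
    $this->balance = 0;
}, $account, Account::class);

$poke();

var_dump($account->balance()); // the private property was changed
```

Visibility alone, including a private setter behind an accessor, does not
stop this kind of write; a property-level readonly guarantee would.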


> As Nikita notes above, a read-only property with a default value is...
> basically a constant already.  So that's not really useful.
>

I agree that they are not very useful; however, I wouldn't restrict their
usage, mainly because there are probably some legitimate use cases, but
also because I think it would be advantageous to be consistent with other
languages in this case.


> For defined-later readonly properties, I'm not sure how the earlier point
> about reading an uninitialized property isn't valid.  Currently:
>
> class Foo {
>   public string $bar;
> }
>
> $f = new Foo();
> print $f->bar; // this throws a TypeError.
>
> I would expect the exact same behavior if $bar were marked
> readonly/locked/whatever.  Are you saying that's not the case?
>

Sorry if I didn't exactly get the question/example, but what I can tell you
is that currently an Error exception is thrown with the message
"Typed property Foo::$bar must not be accessed before initialization",
and that would remain the case with my patch as well, since it doesn't
affect the reading side.
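
The current behaviour can be checked with a small script built on the quoted
example (PHP 7.4 or later is assumed, since typed properties are required):

```php
<?php
class Foo
{
    public string $bar;
}

$f = new Foo();
$message = null;
$isTypeError = null;

try {
    print $f->bar; // read before initialization
} catch (Error $e) {
    $message = $e->getMessage();
    $isTypeError = $e instanceof TypeError;
}

var_dump($message);     // the "must not be accessed before initialization" text
var_dump($isTypeError); // false: a plain Error, not a TypeError
```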

The situation is the same when it comes to unsetting uninitialized typed
properties. Currently, these properties can be unset with no problem (and
the __get(), __set() etc. magic methods are then invoked when accessing
them), and the same would happen with my patch.
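
A sketch of that unset-then-__get() behaviour under current PHP (7.4 or
later). The class `Lazy` and the `$calls` counter are hypothetical and only
serve to show that the magic method runs once:

```php
<?php
// Hypothetical class using unset() to enable lazy initialization.
class Lazy
{
    public string $bar;
    public int $calls = 0;

    public function __construct()
    {
        // Explicitly unsetting the uninitialized typed property makes
        // later reads fall through to __get().
        unset($this->bar);
    }

    public function __get(string $name)
    {
        $this->calls++;
        $this->bar = 'computed'; // initializes the real property
        return $this->bar;
    }
}

$lazy = new Lazy();
$first = $lazy->bar;  // goes through __get()
$second = $lazy->bar; // reads the now-initialized property directly

var_dump($first, $second, $lazy->calls);
```

Without the unset() call in the constructor, the first read would throw
the "must not be accessed before initialization" Error instead.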


> If we could address the performance impact, that would give us much more
> functionality-for-the-buck, including an equivalent of read-only properties
> including potentially lazy initialization.  Or derive-on-demand behavior
> would also be a big increase in functionality.
>
> It's not that I don't see a value to this RFC; I actually have a few
> places in my own code where I could use it.  It's that I see it as being of
> fairly narrow use, so I'm trying to figure out how to increase it so that
> the net-win is worth it.
>

The reason why I brought up this RFC is that I'd really like to add
first-class support for immutable objects, and it seemed to be a good idea
to first go for readonly properties.
This way, the scope of an immutable object RFC gets smaller, while readonly
properties can still be used on their own.

Regards,
Máté


Re: [PHP-DEV] New PCRE function

2020-02-21 Thread Nico Oelgart
On Thu, Feb 20, 2020 at 10:12 AM Nikita Popov wrote:

> FWIW, it is our established stance that all error messages must be
> capitalized. Lower-case first character is only permitted if it is part of
> a function name, or similar cases.
>

Thanks for clearing this up, Nikita.

Given that, I think this lower vs uppercase debate should be a
different discussion held separately if there's real interest in
changing the current stance on this.

Besides that, any other thoughts on the PR?


Re: [PHP-DEV] [RFC] Explicit call-site pass-by-reference (again)

2020-02-21 Thread Nikita Popov
On Fri, Feb 21, 2020 at 12:05 AM Larry Garfield wrote:

> On Thu, Feb 20, 2020, at 8:47 AM, Levi Morrison via internals wrote:
> > Just chiming in to voice strong support for this RFC. This is a key
> > piece toward making PHP code statically analyzable. If it becomes
> > required at the call site, such as in an edition of the language, it
> > will significantly enhance the ability to reason about code and
> > probably make it more correct as well. As a small example, consider
> > this method on an Optional type class:
> >
> > function map(callable $f): Optional {
> >   if ($this->enabled) {
> >     return new Optional($f($this->data));
> >   } else {
> >     return $this;
> >   }
> > }
> >
> > The intent is to return a new optional or an empty one, but if you
> > pass a closure that accepts something by reference you can change the
> > original, which is not intended at all. For people who defend against
> > it, it requires saving `$this->data` to a local variable, then passing
> > in the local. Then if the user does a call-by-reference it will affect
> > the local, not the object's data.
>
>
> If $this->data is itself an object, then you have a concern for data
> manipulation (spooky action at a distance) even if it's passed by value.
> Given how much data these days is objects, and thus the problem exists
> regardless of whether it's by value or by reference passing, adding steps
> to make pass-by-reference harder doesn't seem to help much.
>
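
The scenario quoted above can be reproduced under current PHP. The following
is a sketch only: the constructor, the get() accessor and the split into
mapUnsafe() and map() are assumptions added for illustration and are not
part of the quoted class.

```php
<?php
// Hypothetical reconstruction of the Optional class from the quoted example.
final class Optional
{
    private bool $enabled;
    private $data;

    public function __construct($data, bool $enabled = true)
    {
        $this->data = $data;
        $this->enabled = $enabled;
    }

    // As quoted: passes the property itself, so a callable that declares
    // its parameter by reference can overwrite $this->data.
    public function mapUnsafe(callable $f): Optional
    {
        if ($this->enabled) {
            return new Optional($f($this->data));
        }
        return $this;
    }

    // Defensive variant: copy to a local first, so a by-reference callable
    // can only affect the local variable.
    public function map(callable $f): Optional
    {
        if ($this->enabled) {
            $local = $this->data;
            return new Optional($f($local));
        }
        return $this;
    }

    public function get()
    {
        return $this->data;
    }
}

// A callable that silently takes its argument by reference.
$sneaky = function (&$x) {
    $x = 'clobbered';
    return 'mapped';
};

$a = new Optional('original');
$a->mapUnsafe($sneaky);

$b = new Optional('original');
$b->map($sneaky);

var_dump($a->get(), $b->get()); // only $a had its data overwritten
```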

If you will allow me some exaggeration, what you're basically saying here
is that all the const / readonly / immutability features in (nearly) all
programming languages are useless, because they (nearly) always allow for
interior mutability in one way or another. "const" in JavaScript doesn't
allow you to rebind the object, but you can still modify the object. Same
with "final" in Java. Similar things hold in C/C++/Rust when it comes to
const pointers/references to structs that contain non-const
pointers/references. And of course, the "readonly" RFC for PHP that is
currently under discussion has the same characteristics.

What I'm trying to say here: None of these features guarantees
recursive immutability, but that doesn't render them useless in the least.
In fact, the outer-most layer is where immutability is the most important,
because there's a lot of difference between

$i = 0;
var_dump($i); // int(0)
foo($i);
var_dump($i); // array(7) { ... }
// WTF just happened???

and

$o = new Foo();
var_dump($o); // object(Foo) #42 { xxx }
foo($o);
var_dump($o); // object(Foo) #42 { yyy }
// Did something change in there? Doesn't really matter for this code!

One of the big differences is that by-reference passing can change the
*type* of the variable, while by-object passing cannot. It cannot even
change object identity.
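
A runnable version of that contrast, as a sketch. The functions
replaceWithArray() and mutate() are hypothetical stand-ins for the foo()
calls in the two snippets above:

```php
<?php
// Hypothetical callee taking its argument by reference: it may rebind the
// caller's variable to a value of any type.
function replaceWithArray(&$value): void
{
    $value = [1, 2, 3];
}

// Hypothetical callee receiving an object handle by value: it can mutate
// the object's state, but cannot make the caller's variable point elsewhere.
function mutate(object $obj): void
{
    $obj->state = 'yyy';
    $obj = new stdClass(); // rebinding the local has no effect on the caller
}

$i = 0;
replaceWithArray($i);
var_dump(gettype($i)); // "array": the type of $i changed

$o = new stdClass();
$o->state = 'xxx';
$idBefore = spl_object_id($o);
mutate($o);
var_dump(spl_object_id($o) === $idBefore); // true: still the same object
var_dump($o->state);                       // "yyy": only its interior changed
```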

On a closing note: I don't think this RFC makes passing by reference
"harder" in any meaningful sense. Yes, you do need to write one extra
character. In exchange, every time you read code you will immediately see
that by-reference passing is used: here be dragons.

Regards,
Nikita