Re: [PHP-DEV] [RFC] Asymmetric Visibility, v2

2024-06-05 Thread Ilija Tovilo
Hi Tim

On Tue, Jun 4, 2024 at 7:54 PM Tim Düsterhus  wrote:
>
> One thing that would get pretty wonky would be private-read properties:
> Private property names are currently internally "mangled" to include the
> class name. This allows defining the same private property in multiple
> classes of an inheritance chain, without those classes needing to know
> about the private properties of each other and making the addition and
> removal of a private property not a BC break. For all intents and
> purposes those private properties do not exist, unless you are the class
> itself.
>
> I have no idea what the semantics of a public-write, private-read
> property should be - and this problem is pretty similar to the
> sibling-discussion about making private-set properties implicitly final,
> because otherwise the semantics get wonky.
>
> I believe that the case of making a property public-write, private-read
> is best left to a virtual set-only hook.

Indeed. A private property with a more permissive set operation is
the wrong approach. What we'd want here is a public property with a
restricted get operation. This is not quite expressible with the
current syntax. We'd need something like `public private(get)`, or
`public $prop { private get; }` as in the C# equivalent. However, this
is quite an edge case, and since it requires additional syntax I don't
think it's something we should support without a specific use-case.
You can emulate it with a set-only virtual property, if you really
want to.
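
A minimal sketch of such a set-only virtual property, assuming the property hooks syntax as it shipped in PHP 8.4. The `Credentials` class and its members are hypothetical names chosen for illustration.

```php
<?php

final class Credentials
{
    // Private backing storage; only the class itself can read it.
    private string $passwordHash = '';

    // Virtual property: it has only a set hook and never refers to
    // $this->password, so it has no storage of its own and cannot be read.
    public string $password {
        set (string $value) {
            $this->passwordHash = password_hash($value, PASSWORD_DEFAULT);
        }
    }

    public function matches(string $candidate): bool
    {
        return password_verify($candidate, $this->passwordHash);
    }
}

$credentials = new Credentials();
$credentials->password = 'hunter2';          // public write
var_dump($credentials->matches('hunter2'));  // the class can still use the value
// Reading $credentials->password, from any scope, throws an Error.
```

Note that this is write-only rather than truly private-read: the class reads the separate private property, not `$password` itself.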

Ilija


Re: [PHP-DEV] Fwd: Request for RFC Karma to Propose any_empty and all_empty Methods

2024-06-03 Thread Ilija Tovilo
Hi Elminson!

On Mon, May 27, 2024 at 6:51 PM Elminson De Oleo Baez  wrote:
>
> I hope this message finds you well. I am writing to request RFC karma for my 
> wiki account in order to propose a new RFC.
>
> My proposal involves the introduction of two new methods, any_empty and 
> all_empty, for working with arrays. These methods are designed to provide 
> boolean outputs indicating whether any of the elements in an array are empty, 
> or if all elements are empty, respectively. I believe these methods will be 
> valuable additions to PHP’s array manipulation functionalities.
>
> Below is a brief overview of the proposed methods:
>
> any_empty(array $array): bool - This method will return true if any element 
> in the provided array is empty, and false otherwise.
> all_empty(array $array): bool - This method will return true if all elements 
> in the provided array are empty, and false otherwise.
> These methods aim to simplify common array checks and improve code 
> readability and efficiency.
>
> I look forward to your approval and any guidance you can provide on moving 
> forward with this proposal.

I'm skeptical personally about these functions. empty() doesn't have
the best semantics, and it's going to be rare that all your input
types exactly follow these semantics. I think a deeper dive into
common validation requirements might be good, to see whether they can
be abstracted in some way. It's also worth noting that validation
entails much more than just yielding true or false, e.g. coercion or
graceful errors.
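
To make the concern about empty() concrete, here is a rough userland sketch of the two proposed helpers. The function names `anyEmpty` and `allEmpty` are placeholders, not the proposed API, and the sample inputs are invented.

```php
<?php

// Hypothetical userland versions of the proposed functions, built directly
// on empty(). The names are illustrative only.
function anyEmpty(array $values): bool
{
    foreach ($values as $value) {
        if (empty($value)) {
            return true;
        }
    }
    return false;
}

function allEmpty(array $values): bool
{
    foreach ($values as $value) {
        if (!empty($value)) {
            return false;
        }
    }
    return true;
}

// A quantity of "0" is a legitimate input, yet empty() treats it as empty,
// while a whitespace-only string is considered non-empty.
var_dump(anyEmpty(['quantity' => '0', 'name' => 'Widget'])); // bool(true)
var_dump(anyEmpty(['name' => '   ']));                       // bool(false)
var_dump(allEmpty([0, '0', '', null, false, []]));            // bool(true)
```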

Anyway, I granted you RFC karma. Good luck!

Ilija


Re: [PHP-DEV] [RFC] [Vote] #[\Deprecated] attribute

2024-06-03 Thread Ilija Tovilo
Hi Matthew

On Mon, Jun 3, 2024 at 3:15 PM Matthew Weier O'Phinney
 wrote:
>
> On Wed, May 22, 2024 at 2:24 AM Benjamin Außenhofer  
> wrote:
>>
>> The vote for the RFC #[\Deprecated] attribute is now open:
>>
>> https://wiki.php.net/rfc/deprecated_attribute
>>
>> Voting will close on Wednesday 5th June, 08:00 GMT.
>
> I have voted no for a few reasons:
>
> - Ideally, I'd like to be able to mark _anything_ as deprecated. In 
> particular, not being able to mark a _class/interface/enum/etc_ as deprecated 
> makes this far less useful.

While it's true that extending #[Deprecated] to classes would be
useful, deprecation already exists as a language concept, and it can
be extended to class-like structures without a BC break.

> - The "since" parameter is basically worthless to me. It's very easy to find 
> out the last version that wasn't deprecated. What would be far more useful to 
> a consumer is an argument indicating when something will be removed (e.g. 
> $toRemoveInVersion, $versionForRemoval, etc.). This helps me as a user plan 
> for the future.

Did you vote yes in the secondary vote by accident? I voted no on the
$since parameter for the same reason:

* "How long have I not fixed this?" is not a particularly useful
question to ask. "When do I have to fix this?" is more relevant.
* The format of $since is intentionally left unstandardized, and it's
unclear (to me?) what it refers to. For example, some packages are
split into multiple, smaller ones (e.g. Doctrine) with diverging
version numbers. The sub-package version number may not be useful to
the end-user, who never requires it directly. Similarly, referencing
the main package version may be confusing, especially if the ranges of
recent main and sub-package versions overlap.
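
A small sketch of the ambiguity, assuming the attribute accepts `message` and `since` as named arguments (as it does in PHP 8.4). The functions and the version string are made up.

```php
<?php

// Hypothetical library function. On PHP versions before 8.4 the attribute is
// simply ignored; on 8.4+ calling the function emits a deprecation.
// "2.3" could mean the sub-package or the main package; nothing says which.
#[\Deprecated(message: 'use fetchAll() instead', since: '2.3')]
function fetchEverything(): array
{
    return fetchAll();
}

function fetchAll(): array
{
    return ['a', 'b'];
}

// Collect deprecation messages instead of printing them.
$messages = [];
set_error_handler(function (int $severity, string $message) use (&$messages): bool {
    $messages[] = $message;
    return true;
}, E_DEPRECATED | E_USER_DEPRECATED);

$rows = fetchEverything();
restore_error_handler();

print_r($messages);
```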

Ilija



Re: [PHP-DEV] [RFC] Transform exit() from a language construct into a standard function

2024-05-28 Thread Ilija Tovilo
On Tue, May 28, 2024 at 2:10 PM Gina P. Banyard  wrote:
>
> On Monday, 27 May 2024 at 02:31, Ilija Tovilo  wrote:
>
> > > On Wednesday, 8 May 2024 at 14:40, Gina P. Banyard intern...@gpb.moe 
> > > wrote:
> > >
> > > > I would like to formally propose my idea for exit() as a function 
> > > > brought up to the list on 2024-02-24 [1] with the following RFC:
> > > > https://wiki.php.net/rfc/exit-as-function
> >
> >
> > As mentioned early on in private, I don't see a convincing reason to
> > remove tokenizer/parser support for exit/die. I'd rather see this
> > handled in the parser directly, by converting the standalone keywords
> > to function calls. This avoids any backwards incompatibility, and
> > avoids special handling in zend_compile_const.
>
> I must be honest, I don't really understand how parsers work, so I went with 
> what I could understand and implement.
> But I'm struggling to see a reason for keeping the token in the first place, 
> and what issues hooking into zend_compile_const() has.

Mostly because I think handling exit and die as constants is
misleading. exit; isn't a constant any more than yield; is. Instead,
you could turn exit; into the same AST as exit(); (i.e. a function
call), which will make the compiler handle it automatically. If you
wish, I can have a quick look.

> > Another thing that's probably not too important: The PR likely breaks
> > dead code elimination for exit() and die(). This could be re-added by
> > checking for the never return type instead.
>
> Checking for a never return type seems more robust if it wasn't already 
> supported by DCE.
> I will see if I can do this.

That would be great! I agree that this can be done in retrospect.

Ilija


Re: [PHP-DEV] [RFC] Transform exit() from a language construct into a standard function

2024-05-26 Thread Ilija Tovilo
Hi Gina

On Sun, May 26, 2024 at 11:47 PM Gina P. Banyard  wrote:
>
> On Wednesday, 8 May 2024 at 14:40, Gina P. Banyard  wrote:
>
> >
> > I would like to formally propose my idea for exit() as a function brought 
> > up to the list on 2024-02-24 [1] with the following RFC:
> > https://wiki.php.net/rfc/exit-as-function
>
> As there haven't been any comments for nearly two weeks, I'm planning on 
> opening the vote for the RFC on Tuesday.

As mentioned early on in private, I don't see a convincing reason to
remove tokenizer/parser support for exit/die. I'd rather see this
handled in the parser directly, by converting the standalone keywords
to function calls. This avoids any backwards incompatibility, and
avoids special handling in zend_compile_const.

Another thing that's probably not too important: The PR likely breaks
dead code elimination for exit() and die(). This could be re-added by
checking for the never return type instead. You'd also need to
special-case the lookup of exit/die in namespaced code, where it will
always refer to the global function (as they cannot be declared in
userland).

Ilija


Re: [PHP-DEV] [DISCUSSION] Checking uninitialized class properties

2024-05-20 Thread Ilija Tovilo
On Sat, May 18, 2024 at 4:41 PM Rowan Tommins [IMSoP]
 wrote:
>
> On 18/05/2024 11:52, Luigi Cardamone wrote:
> > Are there any downsides in adding a
> > specific syntax to check if a property
> > is initialized with any value?
>
> In my opinion - and I stress that others may not share this opinion -
> the entire concept of "uninitialized properties" is a wart on the
> language, which we should be doing our best to eliminate, not adding
> more features around it.

I fully agree. The reason NULL was deemed the "billion-dollar mistake"
is not because NULL isn't a useful value, it's because it is
implicitly part of types that are commonly never NULL. For example, in
C, there's no way through the type system to convey whether a pointer
returned from a function may or may not be NULL (ignoring hints
through attributes). As such, the user must look at documentation
(which may be inaccurate), or even guess whether NULL must be handled.
A failure to do so will not result in any compilation errors, but
crashes at runtime. Luckily, PHP does not suffer from this issue. null
is only allowed when the type says so.

However, if you leave your class properties uninitialized, you're
essentially recreating this issue. How do you know whether it is safe
to access class properties? May they be uninitialized under certain
conditions? The type system has no way of conveying this information.
There's no way to know, without looking at the implementation. Static
analysis won't be able to catch it for you either.

So, as Rowan suggested, it is better to hint at this "uninitialized"
value that must be handled through the type system, currently most
conveniently through single-case enums and union types.
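
As a sketch of what this could look like on PHP 8.1 or later: `Missing` and `PatchUserRequest` are hypothetical names, and the PATCH scenario is borrowed from earlier in the thread.

```php
<?php

// Single-case enum acting as a dedicated "not provided" marker.
enum Missing
{
    case Value;
}

final class PatchUserRequest
{
    public function __construct(
        // The type itself says that three states must be handled.
        public int|Missing|null $age = Missing::Value,
    ) {}
}

function describeAge(PatchUserRequest $request): string
{
    return match (true) {
        $request->age === Missing::Value => 'age not sent, keep current value',
        $request->age === null => 'age explicitly cleared',
        default => "age set to {$request->age}",
    };
}

echo describeAge(new PatchUserRequest()), "\n";
echo describeAge(new PatchUserRequest(age: null)), "\n";
echo describeAge(new PatchUserRequest(age: 30)), "\n";
```

Because the marker is part of the declared type, static analysis can flag code that forgets to handle it, which is not possible with an uninitialized property.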

Ilija


Re: [PHP-DEV] [Discussion] Why can't I do "{$a::class}"?

2024-05-19 Thread Ilija Tovilo
On Sun, May 19, 2024 at 1:15 PM Ilija Tovilo  wrote:
>
> Hi Peter
>
> On Sun, May 19, 2024 at 10:30 AM Peter Stalman  wrote:
> >
> > echo " {A::$static_property} \n"; // doesn't work (unless $static_property 
> > is a variable)
> > echo " {$a::$static_property} \n"; // works
> >
> > echo " {A::static_method()} \n"; // doesn't work (just text)
> > echo " {$a::static_method()} \n"; // works
> >
> > echo " {A::constant} \n"; // doesn't work
> > echo " {$a::constant} \n"; // doesn't work either, but why?
>
> There were also suggestions to extend strings in a more generic way,
> akin to JavaScript's template strings [3], but I didn't have any use
> for this myself.
>
> [3] 
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals

What I actually wanted to reference were "tagged templates":

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals#tagged_templates

Sorry for the confusion.

Ilija


Re: [PHP-DEV] [Discussion] Why can't I do "{$a::class}"?

2024-05-19 Thread Ilija Tovilo
Hi Peter

On Sun, May 19, 2024 at 10:30 AM Peter Stalman  wrote:
>
> echo " {A::$static_property} \n"; // doesn't work (unless $static_property is 
> a variable)
> echo " {$a::$static_property} \n"; // works
>
> echo " {A::static_method()} \n"; // doesn't work (just text)
> echo " {$a::static_method()} \n"; // works
>
> echo " {A::constant} \n"; // doesn't work
> echo " {$a::constant} \n"; // doesn't work either, but why?

It would be straightforward to allow all expressions that start with a
`$` in string interpolation, as I noticed a couple of years ago
[1]. This restriction seems rather arbitrary.
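
For reference, a short runnable sketch of the current behavior on PHP 8.0 or later. The class `A` and its members are placeholders mirroring the quoted example.

```php
<?php

class A
{
    public const NAME = 'constant';
    public static string $label = 'static property';

    public static function describe(): string
    {
        return 'static method';
    }
}

$a = new A();

// Both expressions are accepted inside "{...}" today.
echo "{$a::$label}\n";
echo "{$a::describe()}\n";

// "{$a::NAME}" and "{$a::class}" are rejected by the parser, so the value
// has to be read into a variable (or concatenated) first.
$name = $a::NAME;
echo "{$name}\n";
echo 'Class: ' . $a::class . "\n";
```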

I think I held off proposing this because I was planning on proposing
a more complete form of string interpolation [2]. However, the
backwards-compatible syntax was largely disliked, so I withdrew the
RFC.

I wasn't particularly fond of introducing more forms of strings (e.g.
$"", f"", etc.) to avoid the BC break, because there are already
plenty:

* ''
* ""
* `` (yes, these allow interpolation)
* <

Re: [PHP-DEV] [DISCUSSION] Checking uninitialized class properties

2024-05-17 Thread Ilija Tovilo
Hi Luigi

On Fri, May 17, 2024 at 11:40 PM Luigi Cardamone
 wrote:
>
> Here is an example to describe my problem. Imagine a simple
> DTO like this:
>
> class MyDTO{
> public ?int $propA;
> public ?int $propB;
> }
>
> Imagine that a Form processor or a generic mapper fill some of
> these fields with a null value:
>
> $dto = new MyDTO();
> $dto->propA = null;
>
> Sometimes we use DTOs to handle PATCH requests and not all
> the properties are mapped with a value. In a scenario like this,
> "null" is often a valid value.
>
> At this point, I need a way to find if a property was initialized or
> not but unfortunately "isset" is not a solution. When I write:
>
> echo isset($dto->propA) ? 'init' : 'not-init';

IMO, "uninitialized" should not be treated as a value. Code outside of
the constructor should generally not have to deal with uninitialized
properties.

Instead, make sure the constructor leaves your object fully
initialized, in this case likely by setting it to null. If you need
some additional "undefined" state, I think that's better modeled in
other ways, maybe by wrapping it in an object. ADTs [1] should help
with that in the future.

enum Age {
case Unknown;
case Unborn;
case Years(int $years);
}
public Age $age;

Or:

enum Age {
case Unborn;
case Years(int $years);
}
public ?Age $age;

Silly example, but you get the point.
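
The enum syntax above is not valid PHP today. A rough approximation that runs on PHP 8.1 or later might look like the following, where `AgeMarker`, `Years`, and `describeAge` are hypothetical names.

```php
<?php

interface Age {}

// Cases without data map onto a plain enum.
enum AgeMarker implements Age
{
    case Unknown;
    case Unborn;
}

// The case that carries data becomes a small immutable class.
final class Years implements Age
{
    public function __construct(public readonly int $years) {}
}

function describeAge(Age $age): string
{
    return match (true) {
        $age === AgeMarker::Unknown => 'unknown',
        $age === AgeMarker::Unborn => 'not born yet',
        $age instanceof Years => "{$age->years} years",
    };
}

echo describeAge(AgeMarker::Unknown), "\n";
echo describeAge(new Years(42)), "\n";
```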

Ilija

[1] https://wiki.php.net/rfc/tagged_unions


Re: [PHP-DEV] [RFC] [Vote] Type Guards for Classes

2024-05-16 Thread Ilija Tovilo
Hi Patrik

On Thu, May 16, 2024 at 10:31 PM Patrik Václavek  wrote:
>
> Introduce a new type guard syntax for classes:
>
> ```php
> (Foo) $variable;
> ```
>
> This syntax will internally perform the following operations:
>
> 1. Check if `$variable` is an instance of `Foo`.
> 2. If the check fails, throw a `TypeError` with a message indicating the 
> expected and actual types.

Note that this feature is covered under the pattern matching RFC with
the "Throwing alternative" extension.
https://wiki.php.net/rfc/pattern-matching#throwing_alternative

In addition to just class types, it would support all other patterns,
including scalar types, union types, array shapes, etc.
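
Until then, the same check can be written in userland. A minimal sketch for PHP 8.0 or later, where `guard` is a hypothetical helper and not part of either proposal:

```php
<?php

// Returns the value unchanged if it is an instance of $class,
// otherwise throws a TypeError naming the expected and actual types.
function guard(mixed $value, string $class): object
{
    if (!$value instanceof $class) {
        throw new TypeError(sprintf(
            'Expected %s, got %s',
            $class,
            get_debug_type($value)
        ));
    }

    return $value;
}

$countable = guard(new ArrayObject(), Countable::class); // passes

try {
    guard('text', Countable::class);
} catch (TypeError $e) {
    echo $e->getMessage(), "\n"; // Expected Countable, got string
}
```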

Ilija


[PHP-DEV] Introduce ReflectionConstant

2024-04-15 Thread Ilija Tovilo
Hi everyone

We recently received a feature request to allow reflection of global
constants, primarily to check whether they are deprecated.
https://github.com/php/php-src/issues/13570

I created a simple PR to introduce a minimal `ReflectionConstant` class.
https://github.com/php/php-src/pull/13669

If there are no objections, I'd like to merge this PR in a few days.
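
A sketch of how the class might be used, assuming the API that eventually shipped in PHP 8.4 (`getName()`, `getValue()`, `isDeprecated()`). The constant `APP_MODE` and the helper function are invented, and the helper returns null on older versions where the class does not exist.

```php
<?php

define('APP_MODE', 'dev');

// Returns null where ReflectionConstant is unavailable (before PHP 8.4).
function describeGlobalConstant(string $name): ?array
{
    if (!class_exists('ReflectionConstant')) {
        return null;
    }

    $constant = new ReflectionConstant($name);

    return [
        'name' => $constant->getName(),
        'value' => $constant->getValue(),
        'deprecated' => $constant->isDeprecated(),
    ];
}

var_dump(describeGlobalConstant('APP_MODE'));
```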

Ilija


Re: [PHP-DEV] [RFC][Vote announcement] Property hooks

2024-04-14 Thread Ilija Tovilo
Hi Matthew

We're going to skip over the reiterations of Juliette's arguments, as
they have already been responded to.

On Thu, Apr 11, 2024 at 12:08 AM Matthew Weier O'Phinney
 wrote:
>
> On Mon, Apr 8, 2024 at 4:41 PM Ilija Tovilo  wrote:
>>
>> https://externals.io/message/122445#122667
>>
>
> 2. I'm not a huge fan of the short syntax, but the improvements in the most 
> recent draft are _mostly_ ones I can live with. The part that's still unclear 
> is when and where hooks need a `;` termination, and _why_ the `;` is used, 
> instead of `,`. When using `match()` expressions, you use `,` to separate the 
> expressions, but for some reason, the proposal uses `;` ... but only when 
> using short expressions. And if you have a full-form mixed with a short form, 
> the `;` is only needed for the short-form expression. This feels arbitrary, 
> and it will be easy to get it wrong for people comfortable with `match()` 
> statements.

I agree that the distinction of `,` and `;` isn't clear-cut. I would
categorize hooks as declarations, because they are really just
functions attached to the property. Declarations are `;`-terminated,
or have a body (`{}`) (properties, (non-)abstract methods, class
consts, etc). `,`, on the other hand, is used for lists of expressions
and other things (array elements, match cases, function arguments).
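
To illustrate the rule, a small sketch using the hook syntax as it shipped in PHP 8.4. The `User` class is a made-up example.

```php
<?php

final class User
{
    public string $name {
        // Short form: an expression body, terminated by `;` like a declaration.
        get => ucfirst($this->name);

        // Long form: a `{}` body, so no terminator, just like a method.
        set (string $value) {
            $this->name = trim($value);
        }
    }

    // An ordinary property declaration for comparison: also `;`-terminated.
    public int $loginCount = 0;
}

$user = new User();
$user->name = '  alice ';
echo $user->name, "\n"; // Alice
```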

The other argument for the syntax we chose is that it's already used in C#.

> (Larry tells me that it's `match()` being weird here, but considering that 
> for many developers, their only point of reference for this sort of syntax IS 
> `match()`, making it feel like the language is ignoring its own syntax when 
> creating new syntax.)

Given my description from above, I don't believe this is true either.
Match arms most closely relate to the array element syntax, which
already uses `,`.

> 3. While I'd likely prefer Marco's approach to references (just don't allow 
> them), the fact that they mirror how `__get()` and `__set()` _currently_ work 
> gives a migration path for users who are familiar with that paradigm's 
> gotchas. In other words, it's consistent with the current language, and will 
> make migrating from `__set/__get` to hooks easier. It's a lot of complexity, 
> but the table you created helps with that. That table MUST make it to the 
> docs for the feature!

Precisely. Our decision to support references (as much as possible)
comes from the desire to maintain compatibility (as much as possible),
not only with `__get`/`__set` but also plain props.

Ilija


Re: [PHP-DEV] [RFC][Vote announcement] Property hooks

2024-04-11 Thread Ilija Tovilo
Hi Juliette

On 9-4-2024 16:03, Juliette Reinders Folmer wrote:
> On 8-4-2024 23:39, Ilija Tovilo wrote:
>>
>> https://wiki.php.net/rfc/property-hooks
>>
>
> I realize it is late in the discussion period to speak up, but for months 
> I've been trying to find the words to express my concerns in a polite and 
> constructive way and have failed.

First of all, thank you for taking the time to analyze the RFC and
form your thoughts. I know how much time and effort it takes.
Nonetheless, I hope you understand that receiving feedback of this
sort literally minutes before a vote is planned is close to the worst
way to go about it, both from a practical standpoint, but also RFC
author and reviewer morale.

Many of the points you raise have been there for many months to a
year, and many have been raised and discussed many times over. I will
try to address each, and give people some time to respond.

> And not just one syntax, but five and the differences in the semantics of the 
> function syntaxes are significant.

I think this is somewhat of a misrepresentation. For reference, you
can find the grammar here. It's fewer than 50 lines.

https://github.com/php/php-src/blob/bf390b47c7522fd4d130be26b8ee97520f985275/Zend/zend_language_parser.y#L1088-L1131

Notably, `get` and `set` hooks don't actually have a different
grammar. Instead, hooks have names, optional parameter lists, and an
optional short (`=>`) or long (`{}`) body. Whether all possible
combinations are considered different syntax is open to
interpretation. It's true that additional rules apply to each hook,
e.g. whether they are allowed to declare a parameter list, and how the
return value is used. I could see an argument being made for allowing
an empty parameter list (`()`) for `get`, for consistency. Let us know
if you have any other ideas to make them more congruent.

In any case, I don't believe PHP_CodeSniffer would need to handle each
combination separately.

What would your optimal solution look like? I feel this discussion
cannot progress unless we have more concrete discussion points.

> * The implicit "set" parameter, which does not have a explicit parameter, 
> does not have parentheses and has the magically created $value variable.

To be a bit more exact, the parameter list is simply inferred to
`(<property type> $value)` so that the user does not need to spell it
out. This is not akin to other magic variables we had previously, like
`$http_response_header`.

> TL;DR: this RFC tries to do too much in one go and introduces a huge amount 
> of cognitive complexity with all the exceptions and the differences in 
> behaviour between virtual and backed properties. This cognitive complexity is 
> so high that I expect that the feature will catch most developers out a lot 
> of the time.

Many people still say that this RFC doesn't do enough because it
doesn't support x/y/z. This includes you:

> * The arrow function variant for `set` with all the above differences + the 
> differences inherent to arrow functions with the above mentioned exceptions + 
> the implicit assignment, which breaks the expected behaviour of arrow 
> functions by assigning the result of the expression instead of returning it 
> (totally understandable, but still a difference).
> * Properties _may_ or _may not_ have a default value anymore, depending on 
> whether the hooks cause it to be a backed or a virtual property.
> * Properties with hooks can _be_ readonly, but cannot be declared as readonly.
> * Only available for object properties, static properties are excluded.
> * Readonly properties which are not denoted as readonly, but still are due to 
> their virtual nature (get without access to $this->prop), which can only be 
> figured out by, again, studying the contents of the hook function.

There are more below that I will be commenting on in more detail.

I understand that, in a perfect world, each language feature would
automatically work with every other. In reality, this is either not
possible, or complex and time consuming. We opted for the middle
ground, allowing things that were workable and omitting things that
weren't, or required unrelated changes. If this is not acceptable,
then I genuinely don't know how to approach non-trivial RFCs.

> * Properties can now be declared as abstract, but only with explicit hook 
> requirements, otherwise the abstract keyword is not allowed.
> * Properties can now be declared on interfaces, but only with explicit hook 
> requirements, otherwise they are not allowed.

The semantics of plain interface properties are not clear.
Intuitively, `public $prop;` looks equivalent to `public $prop { get;
set; }`, but due to reference semantics it is not. It's not even
equivalent to `{ &get; set; }`, because by-reference hooked properties
still have some inherent limitations
(https://wiki.php.net/rfc/property-hooks#assignm

Re: [PHP-DEV] [RFC][Vote announcement] Property hooks

2024-04-09 Thread Ilija Tovilo
Hi Robert

On Tue, Apr 9, 2024 at 9:34 PM Robert Landers  wrote:
>
> On Tue, Apr 9, 2024 at 8:56 PM Larry Garfield  wrote:
> >
> > The Aviz RFC was put to a vote last year but didn't pass.
>
> It would be really nice if votes weren't just a yes/no vote, but
> yes/needs-more-work/no vote, where needs-more-work and no are
> effectively the same in terms of passing the RFC, but needs-more-work
> just means there is more to do (either addressing ugly syntax or the
> idea is sound, but as it says, it needs more work), and can thus be
> simply revoted on after concerns are addressed -- instead of creating
> a whole new RFC that needs to pass the "not too similar to other RFCs
> rule."

The asymmetric visibility RFC did include a poll for no votes.
https://wiki.php.net/rfc/asymmetric-visibility#proposed_voting_choices

> I got the impression from the Aviz discussions that most people were
> against Aviz due to the syntax, not the feature itself. It would be
> absolutely tragic if this failed to pass simply because people
> expected Aviz here.

According to the poll, syntax was one, but not the primary reason for
its rejection. The primary reason was that some people don't believe
the feature is necessary at all.

IIRC, people were arguing that readonly covers 80% of use-cases,
because it protects against writes to the property both publicly and
privately. I don't agree with this viewpoint, because I think readonly
is bad for ergonomics. In fact, we already had an RFC that attempted
to fix clone for readonly
(https://wiki.php.net/rfc/readonly_amendments) but this fix was not
complete (because it's still not possible to pass values from clone to
__clone). "Clone with" is another thing needed to fix this, and at
this point it just feels like applying more band-aids.
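
A short sketch of the ergonomics problem, runnable on PHP 8.1 or later. `Money` and its methods are hypothetical.

```php
<?php

final class Money
{
    public function __construct(
        public readonly int $amount,
        public readonly string $currency,
    ) {}

    // Looks natural, but fails: the cloned property is already initialized,
    // and readonly forbids a second write even from inside the class.
    public function withAmountViaClone(int $amount): static
    {
        $copy = clone $this;
        $copy->amount = $amount; // throws Error
        return $copy;
    }

    // Works, at the cost of repeating every constructor argument.
    public function withAmount(int $amount): static
    {
        return new static($amount, $this->currency);
    }
}

$price = new Money(100, 'EUR');
$discounted = $price->withAmount(80);

try {
    $price->withAmountViaClone(80);
} catch (Error $e) {
    echo $e->getMessage(), "\n";
}
```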

For DTOs, I believe value types (i.e. data classes,
https://externals.io/message/122845) solve the problem of "spooky
actions at a distance" in a cleaner and more ergonomic way.

For services and other intentional reference types, readonly often
isn't the right choice either, just to make the property not publicly
writable. Asymmetric visibility would be a much more fitting choice.

Anyway, we didn't include asymmetric visibility in this RFC because:

* We wanted to avoid getting rejected by people who fundamentally
dislike asymmetric visibility.
* We didn't feel it was fair to "sneak" the feature back in through
some other RFC, when it was explicitly rejected.

Instead, we are planning to re-propose asymmetric visibility once
property hooks are merged, as it may become more apparent why it is
useful.

Ilija


[PHP-DEV] [RFC][Vote announcement] Property hooks

2024-04-08 Thread Ilija Tovilo
Hi everyone

Heads-up: Larry and I would like to start the vote on the property
hooks RFC tomorrow:
https://wiki.php.net/rfc/property-hooks

We have worked long and hard on this RFC, and hope that we have found
some middle-ground that works for the majority. One last concern we
have not officially clarified on the list:

https://externals.io/message/122445#122667

>> I personally do not feel strongly about whether asymmetric types make it 
>> into the initial implementation. Larry does, however, and I think it is not 
>> fair to exclude them without providing any concrete reasons not to. [snip]
>
> My concern is more about the external impact of what is effectively a change 
> to the type system of the language: [snip] will tools like PhpStan and Psalm 
> require complex changes to analyse code using such properties?

In particular, this paragraph is referencing the ability to widen the
accepted $value parameter type of the set hook, described at the
bottom of https://wiki.php.net/rfc/property-hooks#set. I have talked
to Ondřej Mirtes, the maintainer of PHPStan, and he confirmed that
this should not be complex to implement in PHPStan. In fact, PHPStan
already offers the @property-read and @property-write class
annotations, which can be used to describe "virtual" properties
handled within __get/__set, already providing asymmetric types of
sorts. Hence, this concern should be a non-issue.

Thank you to everybody who has contributed to the discussion!

Ilija


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-06 Thread Ilija Tovilo
Hi Rowan

On Fri, Apr 5, 2024 at 12:28 AM Rowan Tommins [IMSoP]
 wrote:
>
> On 03/04/2024 00:01, Ilija Tovilo wrote:
>
> Regardless of the implementation, there are a lot of interactions we will 
> want to consider; and we will have to keep considering new ones as we add to 
> the language. For instance, the Property Hooks RFC would probably have needed 
> a section on "Interaction with Data Classes".

That remark was implying that data classes really are just classes
with some additional tweaks. That gives us the ability to handle them
differently when desired. However, they will otherwise behave just
like classes, which makes it not so different from your suggestion.

> On a practical note, a few things I've already thought of to consider:
>
> - Can a data class have readonly properties (or be marked "readonly data 
> class")? If so, how will they behave?

Yes. The CoW semantics become irrelevant, given that nothing may
trigger a separation. However, data classes also include value
equality, and hashing in the future. These may still be useful for
immutable data.

> - Can you explicitly use the "clone" keyword with an instance of a data 
> class? Does it make any difference?

Manual cloning is not useful, but it's also not harmful. So I'm
leaning towards allowing this. This way, data classes may be handled
generically, along with other non-data classes.

> - Tied into that: can you implement __clone(), and when will it be called?

Yes. `__clone` will be called when the object is separated, as you would expect.

> - If you implement __set(), will copy-on-write be triggered before it's 
> called?

Yes. Separation happens as part of the property fetching, rather than
the assignment itself. Hence, for `$foo->bar->baz = 'baz';`, once
`Bar::__set('baz', 'baz')` is called, `$foo` and `$foo->bar` will
already have been separated.

> - Can you implement __destruct()? Will it ever be called?

Yes. As with any other object, this will be called once the last
reference to the object goes away. There's nothing special going on.

It's worth noting that CoW makes `__clone` and `__destruct` somewhat
nondeterministic, or at least non-obvious.

> > Consider this example, which would work with the current approach:
> >
> > $shapes[0]->position->zero!();
>
> I find this concise example confusing, and I think there's a few things to 
> unpack here...

I think you're putting too much focus on CoW. CoW should really be
considered an implementation detail. It's not _fully_ transparent,
given that it is observable through `__clone` and `__destruct` as
mentioned above. But it is _mostly_ transparent.

Conceptually, the copy happens not when the method is called, but when
the variable is assigned. For your example:

```php
$shape = new Shape(new Position(42,42));
$copy = $shape; // Conceptually, a recursive copy happens here.
// $shape is already detached from $copy.
// The ! merely indicates that the value is modified.
$copy->position->zero!();
```

> The array access doesn't need any special marker, because there's no 
> ambiguity.

This is only true if you ignore ArrayAccess. `$foo['bar']` does not
necessarily indicate that `$foo` is an array. If it were a `Vector`,
then we would absolutely need an indication to separate it.

It's true that `$foo->bar` currently indicates that `$foo` is a
reference type. This assumption would break with this RFC, but that's
also kind of the whole point.

> What is going to be CoW cloned, and what is going to be modified in place? I 
> can't actually know without knowing the definition behind both $item and 
> $item->shape. It might even vary depending on input.

For the most part, data classes should consist of other value types,
or immutable reference types (e.g. DateTimeImmutable). This actually
makes the rules quite simple: If you assign a value type, the entire
data structure is copied recursively. The fact that PHP delays this
step for performance is unimportant. The fact that immutable reference
types aren't cloned is also unimportant, given that they don't change.
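
PHP arrays already behave this way, so they serve as a runnable analogy for the proposed semantics. The sketch below only uses existing features; the variable names are arbitrary.

```php
<?php

// Arrays are PHP's existing value type: assignment is conceptually a deep
// copy, even though the engine delays the actual copy until a write.
$shape = ['position' => ['x' => 42, 'y' => 42]];
$copy = $shape;
$copy['position']['x'] = 0;
var_dump($shape['position']['x']); // int(42)

// Objects are reference types: both variables point at the same instance.
$shapeObject = new stdClass();
$shapeObject->x = 42;
$alias = $shapeObject;
$alias->x = 0;
var_dump($shapeObject->x); // int(0)

// ArrayAccess objects use array syntax but keep reference semantics.
$list = new ArrayObject(['x' => 42]);
$sameList = $list;
$sameList['x'] = 0;
var_dump($list['x']); // int(0)
```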

Ilija


Re: [PHP-DEV] Proposal: retrieve line, filename and if user defined for ReflectionAttribute

2024-04-05 Thread Ilija Tovilo
Hi Joel

On Fri, Apr 5, 2024 at 3:10 PM Joel Wurtz  wrote:
>
> Like a lot of libraries, we offer the possibility to configure behaviors with 
> Attributes. However in some cases it's wrongly configured by the user and 
> this wrong configuration cannot be detected on the attribute constructor but 
> afterwards.
>
> In this case we may want to pinpoint which attribute (in which file and at 
> which line) causes this bad configuration. Since there was no method to 
> retrieve that information in the ReflectionAttribute I proposed a PR 
> https://github.com/php/php-src/pull/13889 to add it.
>
> I do believe this will allow better DX for end user when correctly used,

I would propose negating and renaming isUserDefined() to isInternal(),
since we already have several such methods. As hinted in the PR, I
believe the implementation is wrong, or rather doesn't do what you
would expect it to do.

I'm also wondering whether it may be more useful to expose the
attribute class as getClass(), without instantiating the attribute
itself. This would allow you to check for isInternal() there, along
with all the other class reflection information.
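
For illustration, a rough sketch of that idea using only reflection
calls that exist today: `ReflectionAttribute::getName()` returns the
attribute's class name without instantiating it, and a
`ReflectionClass` built from that name offers `isInternal()`. `Route`,
`HomeController` and `attributeIsInternal()` are hypothetical names; the
suggested `getClass()` is not used because it does not exist.

```php
#[Attribute(Attribute::TARGET_CLASS)]
final class Route
{
    public function __construct(public string $path) {}
}

#[Route('/home')]
final class HomeController {}

// Hypothetical helper: checks the attribute's class without calling
// newInstance(), so a misconfigured attribute cannot throw here.
function attributeIsInternal(ReflectionAttribute $attribute): bool
{
    return (new ReflectionClass($attribute->getName()))->isInternal();
}

$userAttribute = (new ReflectionClass(HomeController::class))->getAttributes()[0];
$coreAttribute = (new ReflectionClass(Route::class))->getAttributes()[0];

var_dump(attributeIsInternal($userAttribute)); // bool(false): Route is userland
var_dump(attributeIsInternal($coreAttribute)); // bool(true): #[Attribute] is built in
```

Note that `new ReflectionClass()` throws a `ReflectionException` if the
attribute class does not exist, which a real implementation would have
to handle.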

Ilija


Re: [PHP-DEV] RFC idea: using the void type to control maximum arity of user-defined functions

2024-04-04 Thread Ilija Tovilo
On Thu, Apr 4, 2024 at 5:58 PM Tim Düsterhus  wrote:
>
> On 4/4/24 16:36, Pablo Rauzy wrote:
> > I strongly agree in theory, but this could break existing code, and
> > moreover such a proposal was already rejected:
> > https://wiki.php.net/rfc/strict_argcount
>
> The RFC is 9 years old by now. My gut feeling is be that using an actual
> variadic parameter for functions that are variadic is what people do,
> because it makes the function signature much clearer. Actually variadic
> parameters are available since PHP 5.6, which at the time of the
> previous RFC was the newest version. Since then we had two major
> releases, one of which (7.x) is already out of support.
>
> I think it would be reasonable to consider deprecating passing extra
> arguments to a non-variadic function.

IIRC one of the bigger downsides of this change is that closure calls
may provide arguments that the callee does not care about.

https://3v4l.org/0QdoS

```php
function filter($array, callable $c) {
    $result = [];
    foreach ($array as $key => $value) {
        if ($c($value, $key)) {
            $result[$key] = $value;
        }
    }
    return $result;
}

var_dump(filter(['foo', '', 'bar'], function ($value) {
    return strlen($value);
}));

// Internal functions already throw on superfluous args
var_dump(filter(['foo', '', 'bar'], 'strlen'));
```

The user may currently choose to omit the $key parameter of the
closure, as it is never used. In the future, this would throw. We may
decide to create an exemption for such calls, but I'm not sure
replacing one inconsistency with another is a good choice.
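
As a sketch of what callers would be pushed towards if extra arguments
were rejected everywhere (assuming PHP 8, where internal functions
throw ArgumentCountError): either declare the unused parameter, or
wrap the internal function. The `filter()` helper is repeated from
above so that the snippet stands alone; nothing here is proposed
syntax.

```php
function filter(array $array, callable $c): array {
    $result = [];
    foreach ($array as $key => $value) {
        if ($c($value, $key)) {
            $result[$key] = $value;
        }
    }
    return $result;
}

// Today: a user closure silently ignores the extra $key argument.
$short = filter(['foo', '', 'bar'], fn ($value) => strlen($value) > 0);

// Declaring $key explicitly works no matter how strict the engine is.
$explicit = filter(['foo', '', 'bar'], fn ($value, $key) => strlen($value) > 0);

// Internal functions are already strict, so passing 'strlen' throws.
try {
    filter(['foo', '', 'bar'], 'strlen');
    $threw = false;
} catch (ArgumentCountError $e) {
    $threw = true;
}

var_dump($short === $explicit, $threw);
```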

Ilija


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-03 Thread Ilija Tovilo
Hi Larry

On Wed, Apr 3, 2024 at 12:03 AM Larry Garfield  wrote:
>
> On Tue, Apr 2, 2024, at 6:04 PM, Ilija Tovilo wrote:
>
> > I think you misunderstood. The intention is to mark both call-site and
> > declaration. Call-site is marked with ->method!(), while declaration
> > is marked with "public mutating function". Call-site is required to
> > avoid the engine complexity, as previously mentioned. But
> > declaration-site is required so that the user (and IDEs) even know
> > that you need to use the special syntax at the call-site.
>
> Ah, OK.  That's... unfortunate, but I defer to you on the implementation 
> complexity.

As I've argued, I believe the different syntax is a positive. This
way, data classes are known to stay unmodified unless:

1. You're explicitly modifying it yourself.
2. You're calling a mutating method, with its associated syntax.
3. You're creating a reference from the value, either explicitly or by
passing it to a by-reference parameter.

By-reference argument passing is the only way that mutations of data
classes can be hidden (given that they look exactly like normal
by-value arguments), and it's arguably a flaw of by-reference passing
itself. In all other cases, you can expect your value _not_ to
unexpectedly change. For this reason, I consider it as an alternative
approach to readonly classes.
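
The by-reference caveat (point 3) can already be observed with arrays
today. A minimal sketch, with hypothetical function names: both calls
look identical at the call site, but only one of them changes the
caller's value.

```php
function inspect(array $options): void {
    $options['debug'] = true; // modifies a local copy only
}

function addDefaults(array &$options): void {
    $options['debug'] = false; // writes through to the caller's variable
}

$options = ['verbose' => true];

inspect($options);
$afterByValue = $options; // unchanged

addDefaults($options);    // same call syntax, but $options is modified
$afterByRef = $options;

var_dump($afterByValue, $afterByRef);
```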

> > Disallowing ordinary by-ref objects is not trivial without additional
> > performance penalties, and I don't see a good reason for it. Can you
> > provide an example on when that would be problematic?
>
> There's two aspects to it, that I see.
>
> data class A {
>   public function __construct(public string $name) {}
> }
>
> data class B {
>   public function __construct(
> public A $a,
> public PDO $conn,
>   ) {}
> }
>
> $b = new B(new A(), $pdoConnection);
>
> function stuff(B $b2) {
>   $b2->a->name = 'Larry';
>   // This triggers a CoW on $b2, separating it from $b, and also creating a 
> new instance of A.  What about $conn?
>   // Does it get cloned?  That would be bad.  Does it not get cloned?  That 
> seems weird that it's still the same on
>   // a data object.
>
>   $b2->conn->beginTransaction();
>   // This I would say is technically a modification, since the state of the 
> connection is changing.  But then
>   // should this trigger $b2 cloning from $b1?  Neither answer is obvious to 
> me.
> }

IMO, the answer is relatively straightforward: PDO is a reference
type. For all intents and purposes, when you're passing B to stuff(),
B is copied. Since B::$conn is a "reference" (read: pointer), copying B
doesn't copy the connection, only the reference to it. B::$a, however,
is a value type, so copying B also copies A. The fact that this isn't
_exactly_ what happens under the hood due to CoW is an implementation
detail; it doesn't need to change how you think about it. From the
user's standpoint, $b and $b2 can already be considered separate
values once stuff() is called.

This is really no different from arrays:

```php
$b = ['a' => ['name' => 'Larry'], 'conn' => $pdoConnection];
$b2 = $b; // $b is detached from $b2, $b['conn'] remains a shared object.
```
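
A runnable variant of the same array analogy, with a `stdClass` object
standing in for the PDO connection (the `inTransaction` property is
made up for illustration):

```php
$conn = new stdClass();      // stands in for a PDO connection
$conn->inTransaction = false;

$b = ['a' => ['name' => 'Ilija'], 'conn' => $conn];
$b2 = $b;

$b2['a']['name'] = 'Larry';        // separates $b2 and the nested 'a' array
$b2['conn']->inTransaction = true; // the object itself is shared with $b

var_dump($b['a']['name']);           // string(5) "Ilija"
var_dump($b['conn']->inTransaction); // bool(true)
```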

> The other aspect is, eg, serialization.  People will come to expect 
> (reasonably) that a data class will have certain properties (in the abstract 
> sense, not lexical sense).  For instance, most classes are serializable, but 
> a few are not.  (Eg, if they have a reference to PDO or a file handle or 
> something unserializable.)  Data classes seem like they should be safe to 
> serialize always, as they're "just data".  If data classes are limited to 
> primitives and data classes internally, that means we can effectively 
> guarantee that they will be serializable, always.  If one of the properties 
> could be a non-serializable object, that assumption breaks.

I'm not sure that's a convincing argument to fully disallow reference
types, especially since it would prevent you from storing
DateTimeImmutables and other immutable values in data classes and thus
break many valid use-cases. That would arguably be very limiting.

> There's probably other similar examples besides serialization where "think of 
> this as data" and "think of this as logic" is how you'd want to think, which 
> leads to different assumptions, which we shouldn't stealthily break.

I think your assumption here is that non-data classes cannot contain
data. This doesn't hold, and especially will not until data classes
become more common. Readonly classes can be considered strict versions
of data classes in terms of mutability, minus some of the other
semantic changes (e.g. identity).

Ilija


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Ilija Tovilo
Hi Rowan

On Tue, Apr 2, 2024 at 10:10 PM Rowan Tommins [IMSoP]
 wrote:
>
> On 02/04/2024 01:17, Ilija Tovilo wrote:
>
> I'd like to introduce an idea I've played around with for a couple of
> weeks: Data classes, sometimes called structs in other languages (e.g.
> Swift and C#).
>
> I'm not sure if you've considered it already, but mutating methods should 
> probably be constrained to be void (or maybe "mutating" could occupy the 
> return type slot). Otherwise, someone is bound to write this:
>
> $start = new Location('Here');
> $end = $start->move!('There');
>
> Expecting it to mean this:
>
> $start = new Location('Here');
> $end = $start;
> $end->move!('There');
>
> When it would actually mean this:
>
> $start = new Location('Here');
> $start->move!('There');
> $end = $start;

I think there are some valid patterns for mutating methods with a
return value. For example, Set::add() might return a bool to indicate
whether the value was already present in the set.
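
A sketch of that pattern, written as an ordinary class so that it
runs on current PHP. `IntSet` is a hypothetical name; under the
proposal the call would presumably be spelled `$set->add!(42)`.

```php
final class IntSet
{
    /** @var array<int, true> */
    private array $items = [];

    // Mutates the set and reports whether the value was already present.
    public function add(int $value): bool
    {
        $alreadyPresent = isset($this->items[$value]);
        $this->items[$value] = true;
        return $alreadyPresent;
    }

    public function count(): int
    {
        return count($this->items);
    }
}

$set = new IntSet();
$first = $set->add(42);  // false: 42 was not in the set yet
$second = $set->add(42); // true: 42 was already present

var_dump($first, $second, $set->count());
```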

> I seem to remember when this was discussed before, the argument being made 
> that separating value objects completely means you have to spend time 
> deciding how they interact with every feature of the language.

Data classes are classes with a single additional
zend_class_entry.ce_flags flag. So unless customized, they behave as
classes. This way, we have the option to tweak any behavior we would
like, but we don't need to.

Of course, this will still require an analysis of what behavior we
might want to tweak.

> Does the copy-on-write optimisation actually require the entire class to be 
> special, or could it be triggered by a mutating method on any object? To 
> allow direct modification of properties as well, we could move the call-site 
> marker slightly to a ->! operator:
>
> $foo->!mutate();
> $foo->!bar = 42;

I suppose this is possible, but it puts the burden for figuring out
what to separate onto the user. Consider this example, which would
work with the current approach:

$shapes[0]->position->zero!();

The left-hand-side of the mutating method call is fetched by
"read+write". Essentially, this ensures that any array or data class
is separated (copied if RC >1).

Without such a class-wide marker, you'll need to remember to add the
special syntax exactly where applicable.

$shapes![0]!->position!->zero();

In this case, $shapes, $shapes[0], and $shapes[0]->position must all
be separated. This seems very easy to mess up, especially since only
zero() is actually known to be separating and can thus be verified at
runtime.
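
Nested arrays already behave the way the class-wide marker would make
data classes behave, which gives a runnable approximation: a single
nested write separates every level on the path, without any per-level
annotation.

```php
$shapes = [
    ['position' => ['x' => 42, 'y' => 42]],
];
$copy = $shapes;

// Each nested write fetches $copy, $copy[0] and $copy[0]['position']
// for writing, separating whichever of them is still shared.
$copy[0]['position']['x'] = 0;
$copy[0]['position']['y'] = 0;

var_dump($shapes[0]['position']); // still x = 42, y = 42
var_dump($copy[0]['position']);   // x = 0, y = 0
```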

> The main drawback I can see (outside of the implementation, which I can't 
> comment on) is that we couldn't overload the === operator to use value 
> semantics. In exchange, a lot of decisions would simply be made for us: they 
> would just be objects, with all the same behaviour around inheritance, 
> serialization, and so on.

Right, this would either require some other marker that switches to
this mode of comparison, or operator overloading.

Ilija


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Ilija Tovilo
Hi Niels

On Tue, Apr 2, 2024 at 8:16 PM Niels Dossche  wrote:
>
> On 02/04/2024 02:17, Ilija Tovilo wrote:
> > Hi everyone!
> >
> > I'd like to introduce an idea I've played around with for a couple of
> > weeks: Data classes, sometimes called structs in other languages (e.g.
> > Swift and C#).
>
> As already hinted in the thread, I also think inheritance may be dangerous in 
> a first version.
> I want to add to that: if you extend a data-class with a non-data-class, the 
> data-class behaviour gets lost, which is logical in a sense but also 
> surprised me in a way.

Yes, that's definitely not intended. I haven't implemented any
inheritance checks yet. But if inheritance is allowed, then it should
be restricted to classes of the same kind (by-ref or by-val).

> Also, FWIW, I'm not sure about the name "data" class, perhaps "value" class 
> or something alike is what people may be more familiar with wrt semantics, 
> although dataclass is also a known term.

I'm happy with value class, struct, record, data class, what have you.
I'll accept whatever the majority prefers.

> I do have a question about iterator behaviour. Consider this code:
> ```
> data class Test {
> public $a = 1;
> public $b = 2;
> }
>
> $test = new Test;
> foreach ($test as $k => &$v) {
> if ($k === "b")
> $test->a = $test;
> var_dump($k);
> }
> ```
>
> This will reset the iterator of the object on separation, so we will get an 
> infinite loop.
> Is this intended?
> If so, is it because the right hand side is the original object while the 
> left hand side gets the clone?
> Is this consistent with how arrays separate?

That's a good question. I have not really thought about iterators yet.
Modification of an array iterated by-reference does not restart the
iterator. Actually, by-reference capturing of the value also captures
the array by-reference, which is not completely intuitive.

My initial gut feeling is to handle data classes the same, i.e.
capture them by-reference when iterating the value by reference, so
that iteration is not restarted.
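
For reference, the array behavior described above, as a small sketch:
modifying the array inside a by-reference foreach neither restarts nor
detaches the iteration.

```php
$values = ['a' => 1, 'b' => 2];
$visited = [];

foreach ($values as $key => &$value) {
    if ($key === 'b') {
        $values['a'] = 10; // modify the array while iterating by reference
    }
    $visited[] = $key;
}
unset($value); // drop the dangling reference to the last element

var_dump($visited);     // 'a' and 'b', each visited once
var_dump($values['a']); // int(10)
```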

Ilija


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Ilija Tovilo
On Tue, Apr 2, 2024 at 9:43 PM Rowan Tommins [IMSoP]
 wrote:
>
> Similarly, if you discover a compromised key or signing account, you can look 
> for uses of that key or account, which might be a tiny number from a non-core 
> contributor; if you discover a compromised account pushing unsigned commits, 
> you have to audit every commit in the repository.

Right, that and what Jakub mentioned are fair arguments.

> I agree it's not a complete solution, but no security measure is; it's always 
> about reducing the attack surface or limiting the damage.

Right. That was the original intention of my e-mail: To point out that
we might also want to consider other mitigations. Not that we
shouldn't do commit signing.

Ilija


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Ilija Tovilo
Hi Rowan

On Tue, Apr 2, 2024 at 8:48 PM Rowan Tommins [IMSoP]
 wrote:
>
> In fact, you don't need to compromise anybody's key: you could socially 
> engineer a situation where you have push access to the repository, or break 
> the security in some other way. As I understand it, this is exactly what 
> happened 3 years ago: someone gained direct write access to the git.php.net 
> server, and added commits "authored by" Nikita and others to the history in 
> the repository.

Right, but I would like to believe that attaining push access _without
gaining access to a maintainer's account_ should be substantially
harder on GitHub than on our self-hosted git server. :)

> If all commits are signed, a compromised key or account can only be used to 
> sign commits with that specific identity: your GitHub account can't be used 
> to sign commits as Derick or Nikita, only as you. The impact is limited to 
> one identity, not the integrity of the entire repository.

But does it matter? I'm not sure we look at some commits more closely
than others based on their author. It's true that it might be easier to
identify malicious commits if they all come from the same user, but it
wouldn't prevent them.

To be clear: I'm not against commit signing, I've been doing it for
years. I'm just unsure if it's a sufficient solution (apart from
releases, which are a whole different can of worms).

Ilija


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Ilija Tovilo
Hi Larry

On Tue, Apr 2, 2024 at 5:31 PM Larry Garfield  wrote:
>
> On Tue, Apr 2, 2024, at 12:17 AM, Ilija Tovilo wrote:
> > Hi everyone!
> >
> > I'd like to introduce an idea I've played around with for a couple of
> > weeks: Data classes, sometimes called structs in other languages (e.g.
> > Swift and C#).
> >
> > * Data classes are ordinary classes, and as such may implement
> > interfaces, methods and more. I have not decided whether they should
> > support inheritance.
>
> What would be the reason not to?  As you indicated in another reply, the main 
> reason some languages don't is to avoid large stack copies, but PHP doesn't 
> have large stack copies for objects anyway so that's a non-issue.
>
> I've long argued that the fewer differences there are between service classes 
> and data classes, the better, so I'm not sure what advantage this would have 
> other than "ugh, inheritance is such a mess" (which is true, but that ship 
> sailed long ago).

One issue that just came to mind is object identity. For example:

class Person {
    public function __construct(
        public string $firstname,
        public string $lastname,
    ) {}
}

class Manager extends Person {
    public function bossAround() {}
}

$person = new Person('Boss', 'Man');
$manager = new Manager('Boss', 'Man');
var_dump($person === $manager); // ???

Equality for data objects is based on data, rather than the object
handle. How does this interact with inheritance? Technically, Person
and Manager represent the same data. Manager contains additional
behavior, but does that change identity?
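
As a baseline, this is how ordinary (by-reference) classes compare in
current PHP. It is only a reference point, not a proposal for how data
classes should behave: loose equality already requires the same class,
so a Person never equals a Manager even with identical properties.

```php
class Person
{
    public function __construct(
        public string $firstname,
        public string $lastname,
    ) {}
}

class Manager extends Person
{
    public function bossAround(): void {}
}

$person  = new Person('Boss', 'Man');
$twin    = new Person('Boss', 'Man');
$manager = new Manager('Boss', 'Man');

var_dump($person == $twin);    // bool(true): same class, equal properties
var_dump($person === $twin);   // bool(false): different object handles
var_dump($person == $manager); // bool(false): different classes
```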

I'm not sure what the answer is. That's just the first thing that came
to mind. I'm confident we'll discover more such edge cases. Of course,
I can invest the time to find the questions before deciding to
disallow inheritance.

> > * Mutating method calls on data classes use a slightly different
> > syntax: `$vector->append!(42)`. All methods mutating `$this` must be
> > marked as `mutating`. The reason for this is twofold: 1. It signals to
> > the caller that the value is modified. 2. It allows `$vector` to be
> > cloned before knowing whether the method `append` is modifying, which
> > hugely reduces implementation complexity in the engine.
>
> As discussed in R11, it would be very beneficial if this marker could be on 
> the method definition, not the method invocation.  You indicated that would 
> be Hard(tm), but I think it's worth some effort to see if it's surmountably 
> hard.  (Or at least less hard than just auto-detecting it, which you 
> indicated is Extremely Hard(tm).)

I think you misunderstood. The intention is to mark both call-site and
declaration. Call-site is marked with ->method!(), while declaration
is marked with "public mutating function". Call-site is required to
avoid the engine complexity, as previously mentioned. But
declaration-site is required so that the user (and IDEs) even know
that you need to use the special syntax at the call-site.

> So to the extent there is a consensus, equality, stringifying, and a hashcode 
> (which we don't have yet, but will need in the future for some things I 
> suspect) seem to be the rough expected defaults.

I'm just skeptical whether the default __toString() is ever useful. I
can see an argument for it for quick debugging in languages that don't
provide something like var_dump(). In PHP this seems much less useful.
It's impossible to provide a default implementation that works
everywhere (or pretty much anywhere, even).

Equality is already included. Hashing should be added separately, and
probably not just to data classes.

> > * In the future, it should be possible to allow using data classes in
> > `SplObjectStorage`. However, because hashing is complex, this will be
> > postponed to a separate RFC.
>
> Would data class properties only be allowed to be other data classes, or 
> could they hold a non-data class?  My knee jerk response is they should be 
> data classes all the way down; the only counter-argument I can think of it 
> would be how much existing code is out there that is a "data class" in all 
> but name.  I still fear someone adding a DB connection object to a data class 
> and everything going to hell, though. :-)

Disallowing ordinary by-ref objects is not trivial without additional
performance penalties, and I don't see a good reason for it. Can you
provide an example on when that would be problematic?

Ilija


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Ilija Tovilo
Hi Derick

On Tue, Apr 2, 2024 at 4:15 PM Derick Rethans  wrote:
>
> What do y'all think about requiring GPG signed commits for the php-src
> repository?

Let me repost my internal response for visibility.

I'm currently struggling to understand what kind of attack signing
commits prevents.

If your GitHub account is compromised, GitHub allows the attacker to
commit via web interface and will happily sign their commits with a
gpg key auto-generated for your account.

See: 
https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification

> GitHub will automatically use GPG to sign commits you make using the web 
> interface. Commits signed by GitHub will have a verified status. You can 
> verify the signature locally using the public key available at 
> https://github.com/web-flow.gpg.

Even if this wasn't the case, the attacker may simply register their
own gpg key in your account, with the commits appearing as verified.

If your ssh key is compromised instead, and you use ssh to sign your
commits, the attacker may sign their malicious commits with that same
key they may use to push.

The only thing this really seems to prevent is pushing commits via a
compromised ssh key, while commits need to be signed with gpg. If
that's the intention, we should require using gpg rather than ssh for
signing (or using a different ssh key, I suppose). Additionally, it
may help for people who push via HTTP+auth token, but that's probably
not advisable in the first place.

Something that may also help is restricting pushes to patch branches
(PHP-x.y.z) to release managers. These branches are not commonly
looked at by the public, and so it may be easier to sneak malicious
commits into them.

In addition, we should keep GitHub privileges narrow, especially
branch protection configuration.

As mentioned by others, this does not prevent the xz issue. But paired
with an auto-deployment solution, it could definitely help. It would
be even better if release managers cannot change CI, and CI
maintainers cannot create releases, as this essentially enforces the
4-eyes principle. The former may be hard to enforce, as CI lives in
the same repository.

Another solution might be to require PRs, and PR verifications. But
this will inevitably create overhead for maintainers.

Ilija


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Ilija Tovilo
Hi Alexander

On Tue, Apr 2, 2024 at 4:53 AM Alexander Pravdin  wrote:
>
> On Tue, Apr 2, 2024 at 9:18 AM Ilija Tovilo  wrote:
> >
> > I'd like to introduce an idea I've played around with for a couple of
> > weeks: Data classes, sometimes called structs in other languages (e.g.
> > Swift and C#).
>
> While I like the idea, I would like to suggest something else in
> addition or as a separate feature. As an active user of readonly
> classes with all promoted properties for data-holding purposes, I
> would be happy to see the possibility of cloning them with passing
> some properties to modify:
>
> readonly class Data {
> function __construct(
> public string $foo,
> public string $bar,
> public string $baz,
> ) {}
> }
>
> $data = new Data(foo: 'A', bar: 'B', baz: 'C');
>
> $data2 = clone $data with (bar: 'X', baz: 'Y');

What you're asking for is part of the "Clone with" RFC:
https://wiki.php.net/rfc/clone_with

This issue is valid and the RFC would improve the ergonomics of
readonly classes.

However, note that it really only addresses a small part of what this
RFC tries to achieve:

> Some APIs further exacerbate the issue by
> requiring multiple copies for multiple modifications (e.g.
> `$response->withStatus(200)->withHeader('X-foo', 'foo');`).

Readonly works fine for compact data structures, even if they are
copied more often than necessary. For large data structures, like large
lists, a copy for each modification would be detrimental.

https://3v4l.org/GR6On

See how the performance of an insert into an array tanks if a copy of
the array is performed in each iteration (due to an additional
reference to it). Readonly is just not viable for data structures such
as lists, maps, sets, etc.
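
A sketch of why that is, independent of the linked benchmark:
`ImmutableList` is a hypothetical immutable-style list whose `with()`
must duplicate the whole backing array on every insert, whereas a
plain array that is solely owned is appended to in place.

```php
final class ImmutableList
{
    private array $values;

    public function __construct(array $values = [])
    {
        $this->values = $values;
    }

    // Returns a new list. The array is duplicated on write because
    // $this->values still refers to it.
    public function with($value): self
    {
        $values = $this->values;
        $values[] = $value;
        return new self($values);
    }

    public function toArray(): array
    {
        return $this->values;
    }
}

$list = new ImmutableList();
$array = [];
for ($i = 0; $i < 1000; $i++) {
    $list = $list->with($i); // O(n) copy per insert
    $array[] = $i;           // in-place append, sole owner of the array
}

var_dump($list->toArray() === $array); // same contents, very different cost
```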

Ilija


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Ilija Tovilo
Hi Marco

On Tue, Apr 2, 2024 at 2:56 AM Deleu  wrote:
>
>
>
> On Mon, Apr 1, 2024 at 9:20 PM Ilija Tovilo  wrote:
>>
>> I'd like to introduce an idea I've played around with for a couple of
>> weeks: Data classes, sometimes called structs in other languages (e.g.
>> Swift and C#).
>>
>> snip
>>
>> Some other things to note about data classes:
>>
>> * Data classes are ordinary classes, and as such may implement
>> interfaces, methods and more. I have not decided whether they should
>> support inheritance.
>
> I'd argue in favor of not including inheritance in the first version. Taking 
> inheritance out is an impossible BC Break. Not introducing it in the first 
> stable release gives users a chance to evaluate whether it's something we 
> will drastically miss.

I would probably agree. I believe the reason some languages don't
support inheritance for value types is that they are stored on the
stack. Inheritance encourages large structures, but copying very large
structures over and over on the stack may be slow.

In PHP, objects always live on the heap, and due to CoW we don't have
this problem. Still, it may be beneficial to disallow inheritance
first, and relax this restriction if it is necessary.

>> * Mutating method calls on data classes use a slightly different
>> syntax: `$vector->append!(42)`. All methods mutating `$this` must be
>> marked as `mutating`. The reason for this is twofold: 1. It signals to
>> the caller that the value is modified. 2. It allows `$vector` to be
>> cloned before knowing whether the method `append` is modifying, which
>> hugely reduces implementation complexity in the engine.
>
> I'm not sure if I understood this one. Do you mean that the `!` modifier here 
> (at call-site) is helping the engine clone the variable before even diving 
> into whether `append()` has been tagged as mutating?

Precisely. The issue comes from deeper nested values:

$circle->position->zero();

Imagine that Circle is a data class with a Position, which is also a
data class. Position::zero() is a mutating method that sets the
coordinates to 0:0. For this to work, not only the position needs to
be copied, but also $circle. However, the engine doesn't know ahead of
time whether zero() is mutating, and thus whether it needs to perform
a copy.

One idea was to evaluate the left-hand-side of the method call, and
repeat it with a copy if the method is mutating. However, this is not
trivially possible, because opcodes consume their operands. So, for an
expression like `getCircle()->position->zero()`, the return value of
`getCircle()` is already gone. `!` explicitly distinguishes the call
from non-mutating calls, so the engine knows up front that a copy will
be needed.

But as mentioned previously, I think a different syntax offers
additional benefits for readability.

> From outside it looks odd that a clone would happen ahead-of-time while 
> talking about copy-on-write. Would this syntax break for non-mutating methods?

If by break you mean the engine would error, then yes. Only mutating
methods may (and must) be called with the $foo->bar!() syntax.

Ilija


[PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-01 Thread Ilija Tovilo
Hi everyone!

I'd like to introduce an idea I've played around with for a couple of
weeks: Data classes, sometimes called structs in other languages (e.g.
Swift and C#).

In a nutshell, data classes are classes with value semantics.
Instances of data classes are implicitly copied when assigned to a
variable, or when passed to a function. When the new instance is
modified, the original instance remains untouched. This might sound
familiar: It's exactly how arrays work in PHP.

```php
$a = [1, 2, 3];
$b = $a;
$b[] = 4;
var_dump($a); // [1, 2, 3]
var_dump($b); // [1, 2, 3, 4]
```

You may think that copying the array on each assignment is expensive,
and you would be right. PHP uses a trick called copy-on-write, or CoW
for short. `$a` and `$b` actually share the same array until `$b[] =
4;` modifies it. It's only at this point that the array is copied and
replaced in `$b`, so that the modification doesn't affect `$a`. As
long as a variable is the sole owner of a value, or none of the
variables modify the value, no copy is needed. Data classes use the
same mechanism.

But why value semantics in the first place? There are two major flaws
with by-reference semantics for data structures:

1. It's very easy to forget cloning data that is referenced somewhere
else before modifying it. This will lead to "spooky actions at a
distance". Having recently used JavaScript (where all data structures
have by-reference semantics) for an educational IR optimizer,
accidental mutations of shared arrays/maps/sets were my primary source
of bugs.
2. Defensive cloning (to avoid issue 1) will lead to useless work when
the value is not referenced anywhere else.
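
Both flaws can be reproduced with ordinary objects in current PHP. A
minimal sketch, with hypothetical names:

```php
final class Settings
{
    public array $flags = [];
}

// Flaw 1: forgetting to clone mutates the caller's object.
function enableDebug(Settings $settings): Settings
{
    $settings->flags['debug'] = true;
    return $settings;
}

// Flaw 2: the defensive clone happens even if nobody else holds $settings.
function enableDebugSafely(Settings $settings): Settings
{
    $settings = clone $settings;
    $settings->flags['debug'] = true;
    return $settings;
}

$defaults = new Settings();

$safe = enableDebugSafely($defaults);
$defaultsAfterSafe = $defaults->flags;   // still []

$unsafe = enableDebug($defaults);
$defaultsAfterUnsafe = $defaults->flags; // ['debug' => true]

var_dump($defaultsAfterSafe, $defaultsAfterUnsafe);
```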

PHP offers readonly properties and classes to address issue 1.
However, they further promote issue 2 by making it impossible to
modify values without cloning them first, even if we know they are not
referenced anywhere else. Some APIs further exacerbate the issue by
requiring multiple copies for multiple modifications (e.g.
`$response->withStatus(200)->withHeader('X-foo', 'foo');`).
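
To make the copy count concrete, here is a sketch of such a "wither"
API. `Response` is a hypothetical class, not the PSR-7 interface, and
the static counter exists only to count clones for illustration.

```php
final class Response
{
    public static int $clones = 0;

    private int $status = 200;
    private array $headers = [];

    public function __clone()
    {
        self::$clones++; // count copies for illustration only
    }

    public function withStatus(int $status): self
    {
        $copy = clone $this;
        $copy->status = $status;
        return $copy;
    }

    public function withHeader(string $name, string $value): self
    {
        $copy = clone $this;
        $copy->headers[$name] = $value;
        return $copy;
    }
}

$response = (new Response())->withStatus(200)->withHeader('X-foo', 'foo');
var_dump(Response::$clones); // int(2): one copy per modification
```

Neither intermediate object is referenced anywhere else, so both
copies are pure overhead.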

As you may have noticed, arrays already solve both of these issues
through CoW. Data classes allow implementing arbitrary data structures
with the same value semantics in core, extensions or userland. For
example, a `Vector` data class may look something like the following:

```php
data class Vector {
private $values;

public function __construct(...$values) {
$this->values = $values;
}

public mutating function append($value) {
$this->values[] = $value;
}
}

$a = new Vector(1, 2, 3);
$b = $a;
$b->append!(4);
var_dump($a); // Vector(1, 2, 3)
var_dump($b); // Vector(1, 2, 3, 4)
```

An internal Vector implementation might offer a faster and stricter
alternative to arrays (e.g. Vector from php-ds).

Some other things to note about data classes:

* Data classes are ordinary classes, and as such may implement
interfaces, methods and more. I have not decided whether they should
support inheritance.
* Mutating method calls on data classes use a slightly different
syntax: `$vector->append!(42)`. All methods mutating `$this` must be
marked as `mutating`. The reason for this is twofold: 1. It signals to
the caller that the value is modified. 2. It allows `$vector` to be
cloned before knowing whether the method `append` is modifying, which
hugely reduces implementation complexity in the engine.
* Data classes customize identity (`===`) comparison, in the same way
arrays do. Two data objects are identical if all their properties are
identical (including order for dynamic properties).
* Sharing data classes by-reference is possible using references, as
you would for arrays.
* We may decide to auto-implement `__toString` for data classes,
amongst other things. I am still undecided whether this is useful for
PHP.
* Data classes protect from interior mutability. More concretely,
mutating nested data objects stored in a `readonly` property is not
legal, whereas it would be if they were ordinary objects.
* In the future, it should be possible to allow using data classes in
`SplObjectStorage`. However, because hashing is complex, this will be
postponed to a separate RFC.
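
The identity and by-reference points above mirror existing array
behavior, which can be checked in current PHP. This sketch uses arrays
only; it does not show data class syntax.

```php
$a = ['x' => 1, 'y' => 2];
$b = ['x' => 1, 'y' => 2];
$reordered = ['y' => 2, 'x' => 1];

var_dump($a === $b);         // bool(true): same pairs in the same order
var_dump($a === $reordered); // bool(false): order differs
var_dump($a == $reordered);  // bool(true): loose comparison ignores order

$shared = &$a;               // explicit reference: both names share one value
$shared['x'] = 99;
var_dump($a['x']);           // int(99)
```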

One known gotcha is that we cannot trivially enforce placement of
`mutating` on methods without a performance hit. It is the
responsibility of the user to correctly mark such methods.

Here's a fully functional PoC, excluding JIT:
https://github.com/php/php-src/pull/13800

Let me know what you think. I will start working on an RFC draft once
work on property hooks concludes.

Ilija


Re: [PHP-DEV] GitHub milestones

2024-03-27 Thread Ilija Tovilo
Hi Jakub

On Wed, Mar 27, 2024 at 4:55 PM Jakub Zelenka  wrote:
>
> We actually decided not to do it for 8.3 because it would be just waste of 
> time to set all PR's with that milestone. The thing is that PR should just 
> get merged when it's ready and we won't be delaying release because some PR's 
> in that milestone are not ready so it does not have any meaning.

Thank you, I was unaware that this was a conscious decision. I agree
that it's not particularly useful for the next minor version. If they
are ready, nothing is blocking a merge to master.

> I'm not really sure if there's any point to have non-draft PR's targeting 
> next major version because they cannot be merged to master until it is 
> decided the next version will be the major one. So those PR's should be draft 
> but it might make sense to create milestone for them to show quickly why they 
> are in draft.

Draft PRs that target the next major version can make sense if they
are part of an RFC. I generally believe that every RFC should have at
least a proof-of-concept. In my experience, the implementation is the
only reliable way to reveal conceptual issues.

But then again, such RFCs should be in the RFC listing, as mentioned
in my previous message. So I agree that there's not a big need for
milestones.

> So I don't think there is much point in adding 8.4 milestone but 9.0 might be 
> useful.

That sounds reasonable to me, as it shouldn't cost much. Milestones
don't need to be complete either; they can be added where it makes
sense, for things that might be forgotten otherwise.

Ilija


Re: [PHP-DEV] Request for RFC karma

2024-03-27 Thread Ilija Tovilo
Hi!

On Wed, Mar 27, 2024 at 12:59 PM 하늘아부지  wrote:
>
> I request RFC karma to discuss the issue at this link.
> https://github.com/php/php-src/issues/13813
>
> Wiki account : daddyofsky

Either somebody beat me to it, or you already had RFC privileges.

Note that step 1 in the RFC process suggests introducing the RFC idea
to the mailing list first. This way, you can get an initial feel for
whether your idea is desired by other developers, and thus whether
it's worth your time.

Good luck with the RFC!

Ilija


Re: [PHP-DEV] GitHub milestones

2024-03-27 Thread Ilija Tovilo
Hi Peter

On Wed, Mar 27, 2024 at 9:44 AM Peter Kokot  wrote:
>
> I was wondering if it would be useful to add GitHub milestones for the 
> PHP-8.4 and PHP-9.0 (or PHP-next or something like this)?
> https://github.com/php/php-src/milestones
>
> Because some pull requests might target versions after the PHP-8.4 and it 
> might be useful to have them additionally sorted to not forget about them. 
> Not to tag all PRs of course but only those which are meant to go into some 
> of the future branches.

Milestones have already been used in the past
(https://github.com/php/php-src/milestones?state=closed), so there's
no reason not to do the same for 8.4 and 9.0. Most likely, we just
forgot. It's probably not documented as part of the release process.

RFCs that have actionable tasks for the next major version should be
listed under "Pending Implementation / Landing" on the RFC page:
https://wiki.php.net/rfc#pending_implementationlanding

Adding them to a GitHub Milestone would make it a bit more explicit.
As for PRs that don't have an RFC, adding them to a milestone
definitely makes sense. However, PRs that may only be merged in the
next major version, and _don't_ require an RFC are extremely rare.

Ilija


Re: [PHP-DEV] Proposal: AS assertions

2024-03-19 Thread Ilija Tovilo
Hi Rowan

On Tue, Mar 19, 2024 at 8:39 PM Rowan Tommins [IMSoP]
 wrote:
>
> As well as pattern matching, which Ilija mentioned, another adjacent feature is 
> a richer set of casting operators. Currently, we can assert that something is 
> an int; or we can force it to be an int; but we can't easily say "make this 
> an int if safe, but throw otherwise" or "make this an int if safe, but 
> substitute null/$someValue otherwise".
>
> I've been considering how we can improve that for a while, but not settled on 
> a firm proposal - there's a lot of different versions we *could* support, so 
> choosing a minimal set is hard.

I've thought about this in the context of pattern matching a while
back. I was thinking about something like `$x is ~int`, where the
pattern match is successful iff `$x` is coercible to `int` without
loss of information. Given that patterns may be nested, `array<~int>`
could check that all elements of an array are coercible to `int`. The
same could work for literal patterns, e.g. `~5`, where `5`, `5.0` and
`'5'` are all accepted.
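
Since the `~int` pattern is only an idea, the check it describes can be approximated in userland today. The following is a minimal sketch under stated assumptions: `coerceToIntLosslessly()` is a hypothetical helper name, and the exact rules (for example which string forms count as lossless) are a guess at the intended semantics, not anything specified in the pattern matching RFC.

```php
<?php
// Hypothetical helper: returns the int if $value can be coerced to int
// without losing information, or null otherwise.
function coerceToIntLosslessly(mixed $value): ?int
{
    if (is_int($value)) {
        return $value;
    }
    if (is_float($value)) {
        // Accept only finite, integral floats that fit into the int range.
        $fits = is_finite($value)
            && floor($value) === $value
            && $value >= (float) PHP_INT_MIN
            && $value < (float) PHP_INT_MAX;
        return $fits ? (int) $value : null;
    }
    if (is_string($value)) {
        // FILTER_VALIDATE_INT rejects fractional and non-numeric strings.
        $int = filter_var($value, FILTER_VALIDATE_INT);
        return $int === false ? null : $int;
    }
    return null;
}

// Approximation of `array<~int>`: every element must be coercible.
function allCoercibleToInt(array $values): bool
{
    foreach ($values as $value) {
        if (coerceToIntLosslessly($value) === null) {
            return false;
        }
    }
    return true;
}
```

With this sketch, `5`, `5.0` and `'5'` all coerce to `5`, while `5.5` and `'abc'` are rejected.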

This can potentially be combined with the variable binding pattern,
`$var @ pattern`. The syntax looks a bit confusing at first, but it
basically makes sure that the matched value conforms to `pattern`, and
then binds it to `$var`. Hence, something like `$foo as Foo { $bar @
~int }` would 1. make sure `$foo` is an instance of `Foo`, 2. make
sure `$foo->bar` is coercible to `int`, and 3. assign the coerced
value to `$bar`. (It gets more complicated, because the assignment
must be delayed until the entire pattern matches.) If the pattern
matching fails at any point, it throws.

This is just an idea; neither the `as` operator nor the `~` pattern
has been implemented. I don't know whether they are feasible.

Anyway, we're probably going off-topic. :)

Ilija


Re: [PHP-DEV] Proposal: AS assertions

2024-03-19 Thread Ilija Tovilo
Hi Marco

On Tue, Mar 19, 2024 at 7:04 PM Marco Aurélio Deleu  wrote:
>
> > On 19 Mar 2024, at 14:51, Ilija Tovilo  wrote:
> >
> > Hi Robert
> >
> >> On Tue, Mar 19, 2024 at 5:24 PM Robert Landers  
> >> wrote:
> >>
> > See https://wiki.php.net/rfc/pattern-matching#throwing_alternative. I
> > believe this idea would combine nicely with pattern matching. It has
> > many more uses there than just simple class type matching, and could
> > even be used for things like destructuring.
>
> That looks like a PHP dream. Has there been any work regarding that?

https://github.com/iluuu1994/php-src/pull/102/files

The implementation is mostly complete (it might slightly diverge from
the current specification). Bob has called for a different
implementation approach that might be more complex but potentially
easier to optimize; I'll have to play around with it. There are also
still some design decisions that we aren't completely sure about. For
now, Larry and I are just trying to get property hooks over the finish
line.

Ilija


Re: [PHP-DEV] Proposal: AS assertions

2024-03-19 Thread Ilija Tovilo
Hi Robert

On Tue, Mar 19, 2024 at 5:24 PM Robert Landers  wrote:
>
> I've been thinking about this as an RFC for awhile, but with generics
> being far off (if at all), I'd like to propose a useful idea: reusing
> the AS keyword in a different context.
>
> Example:
>
> $x = $attributeReflection->newInstance() as MyAttribute;
>
> This would essentially perform the following code:
>
> assert(($x = $attributeReflection->newInstance()) instanceof MyAttribute);

See https://wiki.php.net/rfc/pattern-matching#throwing_alternative. I
believe this idea would combine nicely with pattern matching. It has
many more uses there than just simple class type matching, and could
even be used for things like destructuring.

Ilija


Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-03-17 Thread Ilija Tovilo
Hi Rowan

On Sun, Mar 17, 2024 at 3:41 PM Rowan Tommins [IMSoP]
 wrote:
>
> The remaining difference I can see in the current RFC which seems to be
> unnecessary is that combining &get with set is only allowed on virtual
> properties. Although it may be "virtual" in the strict sense, any &get
> hook must actually be referring to some value stored somewhere - that
> might be a backed property, another field on the current class, a
> property of some other object, etc:
>
> public int $foo { &get => $this->foo; set { $this->foo = $value; } }
>
> public int $bar { &get => $this->_bar; set { $this->_bar = $value; } }
>
> public int $baz { &get => $this->delegatedObj->baz; set {
> $this->delegatedObj->baz = $value; } }
>
> This sentence from the RFC applies equally to all three of these examples:
>
>  > That is because any attempted modification of the value by reference
> would bypass a |set| hook, if one is defined.
>
> I suggest that we either trust the user to understand that that will
> happen, and allow combining &get and set on any property; or we do not
> trust them, and forbid it on any property.

I'm indeed afraid that people will blindly make their array properties
by-reference, without understanding the implications. Allowing
by-reference behavior for virtual read/write properties is a tradeoff,
for cases where it may be necessary. Exposing private properties
by-reference is already possible outside of hooks
(https://3v4l.org/VNhf7), that's not something we can prevent for
secondary backing properties. However, we can at least make sure that
a reference to the backing value of a hooked property doesn't escape.

I realize this is somewhat inconsistent, but I believe it is
reasonable. If you want to expose the underlying property
by-reference, you need to jump through some additional hoops.
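
To illustrate the point that private properties can already escape by reference without any hooks, here is a minimal sketch. The `Registry` class and its method names are hypothetical and are not taken from the linked 3v4l snippet; the sketch only relies on PHP's existing return-by-reference functions.

```php
<?php
final class Registry
{
    private array $items = [];

    // Returning by reference hands out a handle to the private array itself.
    public function &items(): array
    {
        return $this->items;
    }

    public function count(): int
    {
        return count($this->items);
    }
}

$registry = new Registry();

// The caller has to opt in with =& as well; a plain = would copy the array.
$items = &$registry->items();
$items[] = 'added from outside';

$count = $registry->count();
```

The append through `$items` modifies the private property directly, so no setter or validation logic in the class has a chance to run.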

> > Apart from the things already mentioned, it's unclear to me whether,
> > with such `set;` declarations, a `get`-only backed property should
> > even be legal. With the complete absence of a write operation, the
> > assignment within the `set` itself would fail. To make this work, the
> > absence of `set;` would need to mean something like "writable, but
> > only within another hook", which introduces yet another form of
> > asymmetric visibility.
>
> Any write inside the get hook already by-passes the set hook and refers
> to the underlying property, so there would be no need for any default
> set behaviour other than throwing an error.
>
> It's not likely to be a common scenario, but the below works with the
> current implementation https://3v4l.org/t7qhR/rfc#vrfc.property-hooks
>
> class Example {
>  public int $nextNumber {
>  get {
>  $this->nextNumber ??= 0;
>  return $this->nextNumber++;
>  }
>  // Mimic the current behaviour of a virtual property:
> https://3v4l.org/cAfAI/rfc#vrfc.property-hooks
>  set => throw new Error('Property Example::$nextNumber is
> read-only');
>  }
> }

Again, it depends on how you think about it. As you have argued, for a
get-only property, the backing value should not be writable without an
explicit `set;` declaration. You can interpret `set;` as an
auto-generated hook, or as a marker that indicates that the backing
value is accessible without a hook. As mentioned in my previous
e-mail, auto-generated hooks are something we'd really like to avoid.
So, if the absence of `set;` means that the backing value is not
writable, the hook itself must be exempt from this rule.

Another thing to consider: The current implementation inherits the
backing value and all hooks from its parent. If the suggestion is to
add an explicit `set;` declaration to make it more obvious that the
property is writable, how does this help overridden properties?

```php
class P {
    public $prop {
        get => strtolower($this->prop);
        set;
    }
}

class C extends P {
    public $prop {
        get => strtoupper(parent::$prop::get());
    }
}
```

Even though `P::$prop` signals that it is writable, there is no such
indication in `C::$prop`. You may suggest to also add `set;` to the
child, but then what if the parent adds a custom implementation for
`set;`?

```php
class P {
    public $prop {
        get => strtolower($this->prop);
        set {
            echo $value, "\n";
            $this->prop = $value;
        }
    }
}

class C extends P {
    public $prop {
        get => strtoupper(parent::$prop::get());
        set;
    }
}
```

The meaning for `set;` is no longer clear. Does it mean that there's a
generated hook that accesses the backing field? Does it mean that the
backing field is accessible without a hook? Or does it mean that it
accesses the parent hook? The truth is, with inheritance there's no
way to look at the property declaration and fully understand what's
going on, unless all hooks must be spelled out for the sake of clarity
(e.g. `get => parent::$prop::get()`).

> We are already allowing more than Kotlin by letting hooks call out to a

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-03-16 Thread Ilija Tovilo
Hi Rowan

On Sat, Mar 16, 2024 at 8:23 PM Rowan Tommins [IMSoP]
 wrote:
>
> I still think there will be a lot of users coming from other languages, or 
> from using __get and __set, who will look at virtual properties first. Making 
> things less surprising for those people seems worth some effort, but I'm not 
> asking for a complete redesign.

For clarity, you are asking for a way to make the "virtualness" of
properties more explicit, correct? We touch on a keyword and why we
think it's suboptimal in the FAQ section. Unfortunately, I cannot
think of many alternatives. The `$field` variable made it a bit more
obvious, but only marginally.

I do believe that, for the most part, the user should not have to
think much about whether the property is backed or virtual. The
behavioral differences are mostly intuitive. For example:

```php
class Test {
    // This property has a set hook that writes to the backing value. Since
    // we're using the backing value, it makes sense for there to be a way to
    // retrieve it. Without that, it wouldn't be useful.
    public $prop {
        set {
            $this->prop = strtoupper($value);
        }
    }

    // Similarly, a property with only a get hook that accesses the backing
    // value would need a way to write to the property for the get to be useful.
    public $prop {
        get => strtoupper($this->prop);
    }

    // A property with a get hook that does not use the backing value does not
    // need an implicit set operation, as writing to the backing value would be
    // useless, given that nobody will read it.
    public $prop {
        get => 42;
    }

    // Similarly, in the esoteric write-only case that does not use the backing
    // value, having an implicit get operation would always lead to a
    // "uninitialized property" error, and is not useful as such.
    public $prop {
        set {
            echo "Prop set\n";
        }
    }
}
```

Furthermore, `serialize`, `var_dump` and the other functions operating
on raw property values will include the property only if it is backed.
This also seems intuitive to me: If you never use the backing value,
the backing value would always be uninitialized, so there's no reason
to include it.

One case that is not completely obvious is lazy-initialized properties.

```php
class Test {
    public $prop {
        get => $this->prop ??= expensiveOperation();
    }
}
```

It's not immediately obvious that there is a public set operation
here. The correct way to fix this would be with asymmetric visibility,
which was previously declined. Either way, I don't consider this case
alone enough to completely switch our approach. Please let me know if
you are aware of any other potentially non-intuitive cases.

I will admit that it is unfortunate that a user of the property has to
look through the hook implementation to understand whether a property
is writable. As you have previously suggested, one option might be to
add an explicit `set;` declaration. Maybe it's a bit more obvious now,
after my previous e-mail, why we are trying to avoid this.

Apart from the things already mentioned, it's unclear to me whether,
with such `set;` declarations, a `get`-only backed property should
even be legal. With the complete absence of a write operation, the
assignment within the `set` itself would fail. To make this work, the
absence of `set;` would need to mean something like "writable, but
only within another hook", which introduces yet another form of
asymmetric visibility.

> > Dynamic properties are not particularly relevant today. The point was
> > not to show how similar these two cases are, but to explain that
> > there's an existing mechanism in place that works very well for hooks.
> > We may invent some new mechanism to access the backing value, like
> > `field = 'value'`, but for what reason? This would only make sense if
> > the syntax we use is useful for something else. However, given that
> > without guards it just leads to recursion, which I really can't see
> > any use for, I don't see the point.
>
> I can think of several reasons we *could* explore other syntax:
>
> 1) To make it clearer in code whether a particular line is accessing via the 
> hooks, or by-passing them 2) To make the code in the hooks shorter (e.g. 
> `$field` is significantly shorter than `$this->someDescriptiveName`) 3) To 
> allow code to by-pass the hooks at will, rather than only when called from 
> the hooks (e.g. having a single method that resets the state of several 
> lazy-loaded properties)
>
> Those reasons are probably not enough to rule out the current syntax; but 
> they show there are trade-offs being made.

Fair enough. 1 and 2 are reasons why we added the `$field` macro as an
alternative syntax in the original draft. I don't quite understand
point 3. In Kotlin, `field` is only usable within its associated hook.
Other languages I'm aware of do not provide a way to access the
backing value directly, neither inside nor 

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-03-16 Thread Ilija Tovilo
Hi Rowan

On Sat, Mar 16, 2024 at 9:32 AM Rowan Tommins [IMSoP]
 wrote:
>
> On 16 March 2024 00:19:57 GMT, Larry Garfield  wrote:
>
> >Well, reading/writing from within a set/get hook is an obvious use case to 
> >support.  We cannot do cached properties easily otherwise:
> >
> >public string $expensive {
> >  get => $this->expensive ??= $this->compute();
> >  set {
> >if (strlen($value) < 50) throw new Exception();
> >$this->expensive = $value;
> >  }
> >}
>
>
> To play devil's advocate, in an implementation with only virtual properties, 
> this is still perfectly possible, just one declaration longer:
>
> private string $_expensive;
> public string $expensive {
>   get => $this->_expensive ??= $this->compute();
>   set {
> if (strlen($value) < 50) throw new Exception();
> $this->_expensive = $value;
>   }
> }
>
> Note that in this version there is an unambiguous way to refer to the raw 
> value from anywhere else in the class, if you wanted a clearAll() method for 
> instance.
>
> I can't stress enough that this is where a lot of my thinking comes from: 
> that backed properties are really the special case, not the default. Anything 
> you can do with a backed property you can do with a virtual one, but the 
> opposite will never be true.
>
>
> The minimum version of backed properties is basically just sugar for that - 
> the property is still essentially virtual, but the language declares the 
> backing property for you, leading to:
>
> public string $expensive {
>   get => $field ??= $this->compute();
>   set {
> if (strlen($value) < 50) throw new Exception();
> $field = $value;
>   }
> }
>
> I realise now that this isn't actually how the current implementation works, 
> but again I wanted to illustrate where I'm coming from: that backed 
> properties are just a convenience, not a different type of property with its 
> own rules.

That's not really how we think about it. Our design decisions have
been guided by a few factors:

1. The RFC intentionally makes plain properties and properties with
hooks as fully compatible as possible.

A subclass can override a plain property by adding hooks to it. Many
other languages only allow doing so if the parent property already has
generated accessors (`{ get; set; }`). For many of them, switching
from a plain property to one with accessors is actually an ABI break.
One requires generating assembly/IR instructions that access a field
in some structure, the other one is a method call. This is not
relevant in our case.

In most languages, a consequence of `{ get; set; }` is that such
properties cannot be passed by reference. This part _is_ relevant to
PHP, because PHP makes heavy use of explicit by-reference passing for
arrays, but not much else. However, as outlined in the RFC, arrays are
not a good use-case for hooks to begin with. So instead of fragmenting
the entirety of all PHP code bases into plain and `{ get; set; }`
properties where it doesn't actually make a semantic difference, and
then not even using them when it would matter (arrays), we have
decided to avoid generated hooks altogether.

The approach of making plain and hooked properties compatible also
immediately means that a property can have both a "backing value"
(inherited from the parent property) and hooks (from the child
property). This goes against your model that backed properties are
really just two properties, one for the backing value and a virtual
one for the hooks.

Our approach has the nice side effect of properties only containing
hooks when they actually do something. We don't need to deal with
optimizations like "the hook is auto-generated, revert to accessing
the property directly to make it faster", or even just having the
generated hook taking up unnecessary memory. You can think of our
properties this way:

```php
class Property {
    public ?Data $storage;
    // Closure rather than callable, as callable is not a valid property type.
    public ?Closure $getHook;
    public ?Closure $setHook;

    public function get() {
        if ($hook = $this->getHook) {
            return $hook();
        } elseif ($this->storage) {
            return $this->storage->get();
        } else {
            throw new Error('Property is write-only');
        }
    }

    public function set($value) {
        if ($hook = $this->setHook) {
            $hook($value);
        } elseif ($this->storage) {
            $this->storage->set($value);
        } else {
            throw new Error('Property is read-only');
        }
    }
}
```

Properties can inherit both storage and hooks from their parent.
Hopefully, that helps with the mental model. Of course, in reality it
is a bit more complicated due to guards and references.

2. Although you say backed properties are just syntactic sugar, they
really are not. For example, renaming a public property, making it private
and replacing it with a new passthrough virtual property breaks
serialization, as serialization works on the object's raw values. On
the other hand, adding a hook to an existing property doesn't
influence its backing 

Re: [PHP-DEV] automatic formatting checks for pull requests?

2024-02-18 Thread Ilija Tovilo
On Sun, Feb 18, 2024 at 4:11 PM Gina P. Banyard  wrote:
>
> On Saturday, 17 February 2024 at 22:18, Ilija Tovilo  
> wrote:
>
> > * The new code style should be applied only to newly added sections or
> > changed code, not entire files. Otherwise, we'll have many changes in
> > large files, with endless merge conflicts when merging up from lower
> > branches.
>
> Surely the best way is to apply the formatting tool on all branches that are 
> supported (even in security support).
> Have them be merged upwards, and then add the revisions of the commits to a 
> .git-blame-ignore-revs file so that git blame doesn't care about them.
>
> This should resolve the issue of making future merges difficult.

Presumably, this would lead to merge conflicts in every open pull
request. Maybe a resolution strategy could be automated, but that is
still not ideal.

Additionally, given that the PR has discovered a clang-format bug that
changes behavior
(https://github.com/php/php-src/pull/13417#issuecomment-1950920114),
I'd be wary of applying the formatting blindly to our stable branches.


Re: [PHP-DEV] automatic formatting checks for pull requests?

2024-02-17 Thread Ilija Tovilo
Hi Hans

On Sat, Feb 17, 2024 at 3:31 PM Gina P. Banyard  wrote:
>
> On Saturday, 17 February 2024 at 11:24, Hans Henrik Bergan  
> wrote:
>
> > Can we add automatic formatting checks for pull requests?
> > Made a PR: https://github.com/php/php-src/pull/13417
>
> It would be nice to have some formatting rules to harmonize the codebase as 
> it is somewhat the wild west,
> but as far as my understanding goes is that Clang format struggles to 
> understand our codebase (namely macros) and is difficult to set-up for 
> php-src.

Right. Consistent code style is nice, but what we have now is really
not that bad. There are a couple things I'd want if we enforce code
style:

* Fixing the style should be easy, running a single command without
first pushing to CI.
* It should be fast too, so that I can easily run it for every commit,
preferably even on-save in my editor.
* The new code style should be applied only to newly added sections or
changed code, not entire files. Otherwise, we'll have many changes in
large files, with endless merge conflicts when merging up from lower
branches.
* The formatting tool should work for all php-src code, not just plain
C code. We don't want to be forced to refactor old macros just because
we need to add a single line to some long-standing code. Last time I
tried clang-format, it utterly failed with our macros.

I haven't looked at your PR in detail, so I'm not sure which of these
points it satisfies. It would be great if you could quickly describe
how it works, and what the goals are.

Essentially, I'm just sceptical that this is worth the trouble.

Ilija


Re: [PHP-DEV] RE: Testing new list server

2024-02-17 Thread Ilija Tovilo
Hi

On Sat, Feb 17, 2024 at 12:17 AM Jorg Sowa  wrote:
>
> Hello Derick,
> there is something wrong. I don't get all of the emails from the new setup, 
> only part. Examples of emails I didn't receive:
> - https://externals.io/message/122391
> - https://externals.io/message/122390
> - https://externals.io/message/122388
>
> I'm using Gmail and Spam doesn't contain any of them.

Same here. I'm using Gmail and have not received various e-mails.

Ilija


Re: [PHP-DEV] Requesting RFC karma

2024-02-17 Thread Ilija Tovilo

Hi Hans

On 16.02.24 13:05, Hans Henrik Bergan wrote:
>
> My name is "Hans Henrik Bergan", usually go by the nickname
> "divinity76", I've contributed to OSS (including PHP) for years, and
> am currently involved in 3 things that might require an RFC, and
> requesting RFC karma for wiki account "divinity76".


RFC karma was granted, good luck!

Ilija


Re: [PHP-DEV] Registration apply for php wiki

2024-02-17 Thread Ilija Tovilo
Hi JaeHan!

On Fri, Feb 16, 2024 at 10:53 AM 하늘아부지  wrote:
>
> Hi, My name is JaeHan Seo. (wiki username is daddyofsky)
>
> I hope I can make a proposal by registering on the php wiki.

I can give you karma. Usually, RFC karma (which is what I'm guessing
you're asking for) is granted for specific RFC ideas. Could you
quickly summarize what your idea is? See step 1 in the RFC howto:

https://wiki.php.net/rfc/howto

Thank you!
Ilija


Re: [PHP-DEV] [Discussion] Thoughts on casting to null

2024-02-14 Thread Ilija Tovilo
Hi Robert

On Wed, Feb 14, 2024 at 1:29 AM Robert Landers  wrote:
>
> I won't be the first to say this, at first glance, casting to null
> sounds silly, but short arrow functions must always return something,
> by design. That's when casting to null makes any sense at all (that I
> can think of): you want to write a succinct, short function but
> guarantee the result is discarded. Instead, if you really must use a
> short arrow function, you have to do something even weirder:
>
> EventLoop::repeat($pingInterval, fn() => $client->ping() ? null : null);

While (void) $client->ping() would solve your problem, it's not very
useful outside this scenario.

I think there are two issues you're implicitly referring to.

1. Arrow functions cannot be void, because they always return something.
2. Arrow functions cannot contain multiple statements.

As for the former, PHP actually had a similar issue for never closures
that was solved a while ago [1]. In the report I suggested that the
same could be done for void, by making void arrow functions evaluate
and drop the right hand side of =>, and always return nothing (i.e.
null). This would solve your issue, although probably mostly by
accident (because void functions return null, and your caller expects
exactly null). Regardless, I think this change would be useful, if
just to signal that the return value of an arrow function is not
intended to be used.

The latter would require some sort of block or grouped expression.
Short closures have been discussed extensively in the past, so I won't
get into that. There's also the comma operator in some languages like
C and JavaScript that evaluates a list of expressions and returns the
result of the last one, although it is probably not universally liked
(e.g. fn () => ($client->ping(), null)).
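
Until something along those lines exists, the difference can be seen by comparing an arrow function with a regular closure declared `void`. This is a minimal sketch; the `ping()` function is a hypothetical stand-in for `$client->ping()` and simply returns `true`.

```php
<?php
// Hypothetical stand-in for $client->ping().
function ping(): bool
{
    return true;
}

// An arrow function always returns the value of its expression.
$arrow = fn() => ping();

// A regular closure can be declared void; its call evaluates to null.
$discarding = function (): void {
    ping();
};

$arrowResult = $arrow();
$discardingResult = $discarding();
```

Here `$arrowResult` is `true` while `$discardingResult` is `null`, at the cost of the longer closure syntax.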

[1] 
https://github.com/php/php-src/commit/f957e3e7f17eac6e9195557f3fa79934f823fd38

Ilija

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] php-src docs

2024-02-12 Thread Ilija Tovilo
Hi Yuya

It seems you accidentally sent your response to me instead of the list.

On Sun, Feb 11, 2024 at 5:10 PM youkidearitai  wrote:
>
> 2024年2月11日(日) 21:18 Ilija Tovilo :
> >
> > Hi everyone.
> >
> > I would like to start an initiative to centralize documentation of the
> > PHP internals.
> > https://github.com/php/php-src/pull/13338
> > https://iluuu1994.github.io/php-src/ (will be moved to php.github.io
> > once merged)
> >
> > Let me know of any thoughts and suggestions you might have.
>
> Hi, Ilija.
> Thank you for your great suggestion.
>
> It seems to make sense to have a set of documents about the structure of
> php-src in php-src.
> It would be easy to create pull requests for them.
>
> Although I have to learn reStructuredText, that does not seem to be a major problem.

For some context, I initially planned to go with the mdBook from the
Rust project (https://github.com/rust-lang/mdBook/) as Markdown is a
bit more approachable. After writing the sample zval chapter, I
noticed some pain points in terms of formatting, most significantly
tables. That said, reStructuredText is far from perfect itself.

As mentioned previously, the other reason for choosing Sphinx was that
it is quite extensible.

Ilija


Re: [PHP-DEV] php-src docs

2024-02-12 Thread Ilija Tovilo
Hi Mönôme

On Sun, Feb 11, 2024 at 2:20 PM Mönôme Epson  wrote:
>
> > centralize documentation of the PHP internals
> I'm glad to hear that you're planning to centralize the documentation for PHP 
> internals.
>
> > Let me know of any thoughts and suggestions you might have.
> I have a preference for devel-docs/ instead of docs/ . This would make the 
> doc-en repository a PHP subproject.

Can you clarify? Do you mean a separate repository called devel-docs?
I don't hear the term "devel" much outside of package managers. The
suggestion was to put it directly in the php-src repository, where
there's not much confusion about its content (precisely the php-src
repository).

Ilija


[PHP-DEV] php-src docs

2024-02-11 Thread Ilija Tovilo
Hi everyone.

I would like to start an initiative to centralize documentation of the
PHP internals. As mentioned in a recent e-mail, there's a lot of
useful documentation scattered across the internet, like the PHP
internals book, various blogs of contributors, wiki.php.net, to name a
few. Information is currently hard to discover over multiple mediums.
Some of these mediums are also prone to go stale and can't be updated
by internals (e.g. blog posts).

After a brief discussion at the foundation, we think it's best to
incorporate the documentation directly into the php-src repository.
This makes it very easy to discover, contribute to, and allows
updating documentation right alongside its technical changes. To make
browsing the documentation easier, the documentation is built with
Sphinx and published to GitHub Pages.

I've prepared a PR here:
https://github.com/php/php-src/pull/13338
https://iluuu1994.github.io/php-src/ (will be moved to php.github.io
once merged)

Let me know of any thoughts and suggestions you might have.

Ilija


Re: [PHP-DEV] Why are serialized strings wrapped in double quotes? (s::"")

2024-02-07 Thread Ilija Tovilo
Hi Sandy

On Tue, Feb 6, 2024 at 9:19 PM Sanford Whiteman  wrote:
>
> I'd like a little background on something we've long accepted: why
> does the serialization format need double quotes around a string, even
> though the byte length is explicit?
>
> Example:
>
>   s:5:"hello";
>
> All else being equal I would think we could have just
>
>   s:5:hello;
>
> Was this just to make strings look more 'stringy', even though the
> format isn't meant to be human-readable?

I don't have the historical context, but I'm assuming that's it. PHP's
serialization format is not efficient, and I don't think that was ever
the primary focus. If you need something more efficient, you can try
https://github.com/igbinary/igbinary which aims to be a drop-in
replacement.
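
For reference, a short sketch of what the format looks like in practice. It shows that the length prefix counts bytes and that embedded double quotes are not escaped, which is why the surrounding quotes carry no information of their own. The sample strings are arbitrary.

```php
<?php
// Length prefix, then the raw bytes wrapped in double quotes.
$plain = serialize('hello');

// Embedded quotes are stored verbatim; the length alone delimits the value.
$quoted = serialize('a"b');

// The prefix is the byte length, not the character count.
$multibyte = serialize("\u{00E9}"); // "é" encoded as two UTF-8 bytes

// unserialize() relies on the length, so the round trip is lossless.
$roundTrip = unserialize($quoted);
```

The serialized forms are `s:5:"hello";`, `s:3:"a"b";` and a two-byte string with prefix `s:2:`.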

Ilija


[PHP-DEV] Re: [RFC][Vote] RFC1867 for non-POST HTTP verbs

2024-02-05 Thread Ilija Tovilo
Hi everyone

On Mon, Jan 22, 2024 at 10:23 AM Ilija Tovilo  wrote:
>
> I started the vote on the "RFC1867 for non-POST HTTP verbs" RFC.
> https://wiki.php.net/rfc/rfc1867-non-post

The RFC has been accepted with 23 yes and 1 no vote. As promised to
Sara, I will be setting up a poll for the suggested, alternative
function name in the coming days.

Ilija




Re: [PHP-DEV] Discussion: making continue and break into an expression

2024-01-25 Thread Ilija Tovilo
Hi Larry

On Thu, Jan 25, 2024 at 6:38 PM Larry Garfield  wrote:
>
> On Thu, Jan 25, 2024, at 11:28 AM, Ilija Tovilo wrote:
>
>
> > This leads to very similar issues as break/continue inside blocks. See:
> > https://wiki.php.net/rfc/match_blocks#technical_implications_of_control_statements
> >
> I'm curious, how did `throw` expressions manage to avoid these issues?  Or 
> was it just "Ilija did the hard work of tracking down the weirdness?"

Can't really take the credit for this. This issue went over my head,
as this was my first RFC.

Exceptions work a bit differently, in that they use something called
live-ranges. Essentially, we look at the generated opcodes and figure
out which variables are "live" (i.e. valid and unfreed) during which
opcodes. For something like echo foo() + bar():

Pseudo opcodes:

0000 V1 = CALL foo
0001 V2 = CALL bar
0003 V3 = ADD V1 V2
0004 ECHO V3

V1 would be live for 0000-0003, V2 for 0001-0003, V3 for 0003-0004. If
an exception is thrown (or rethrown across function boundaries) the VM
checks which temporary variables are currently live and frees them. So
if CALL bar were to throw, we'd see that V1 is currently live and
needs to be freed. For something like foo() + throw new Exception(),
if you replace the second CALL with a throw, you'll see that the
live-range for V1 doesn't change, and so this "just works".
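
As a rough sketch of the idea (a toy model written in PHP, not the actual
algorithm in php-src), live ranges can be derived by recording where each
temporary is defined and where it is consumed. The array layout and the
function names compute_live_ranges() and live_when_throwing() are made up
for this example.

```php
<?php
/**
 * Toy live-range computation over pseudo opcodes.
 * Each opcode is [result or null, name, operands...]; keys are opcode numbers.
 */
function compute_live_ranges(array $opcodes): array
{
    $ranges = [];
    foreach ($opcodes as $num => $op) {
        // Any operand that is a known temporary is consumed by this opcode.
        foreach (array_slice($op, 2) as $operand) {
            if (isset($ranges[$operand])) {
                $ranges[$operand]['end'] = $num;
            }
        }
        // A result starts a new range that is open until a consumer is seen.
        if ($op[0] !== null) {
            $ranges[$op[0]] = ['start' => $num, 'end' => null];
        }
    }
    return $ranges;
}

/** Temporaries defined before $num and consumed after it, i.e. the ones to free. */
function live_when_throwing(array $ranges, int $num): array
{
    $live = [];
    foreach ($ranges as $var => $range) {
        if ($range['end'] !== null && $range['start'] < $num && $num < $range['end']) {
            $live[] = $var;
        }
    }
    return $live;
}

// echo foo() + bar();
$ranges = compute_live_ranges([
    0 => ['V1', 'CALL', 'foo'],
    1 => ['V2', 'CALL', 'bar'],
    3 => ['V3', 'ADD', 'V1', 'V2'],
    4 => [null, 'ECHO', 'V3'],
]);

// echo foo() + throw new Exception();
$rangesWithThrow = compute_live_ranges([
    0 => ['V1', 'CALL', 'foo'],
    1 => [null, 'THROW'],
    3 => ['V3', 'ADD', 'V1', 'false'],
    4 => [null, 'ECHO', 'V3'],
]);
```

In both listings V1 spans 0000-0003, so an exception raised at 0001 finds V1
live and frees it.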

There was, however, a related issue with the optimizer.

echo foo() + throw new Exception();

0000 V1 = CALL foo
0001 THROW
0003 V3 = ADD V1 false
0004 ECHO V3

Here, the optimizer would remove the dead instructions after the
throw, breaking live-range analysis:

0000 V1 = CALL foo
0001 THROW

V1 no longer had a consuming opcode, and as such the algorithm could
no longer determine the live-range of V1. This would cause V1 to leak.
The solution was simply to disable dead code elimination for this
case. The solution was suggested by Tyson Andre and implemented by
Nikita.
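
The effect of the dead code elimination can be sketched with a toy check
(again plain PHP over pseudo opcodes, not php-src code; the helper name
temporaries_without_consumer() is hypothetical). Once the ADD is removed,
nothing consumes V1 anymore, so there is no end point for its range.

```php
<?php
/**
 * Returns temporaries that are defined but never used as an operand.
 * Each opcode is [result or null, name, operands...].
 */
function temporaries_without_consumer(array $opcodes): array
{
    $defined = [];
    $consumed = [];
    foreach ($opcodes as $op) {
        foreach (array_slice($op, 2) as $operand) {
            $consumed[$operand] = true;
        }
        if ($op[0] !== null) {
            $defined[] = $op[0];
        }
    }
    return array_values(array_filter($defined, fn($var) => !isset($consumed[$var])));
}

$beforeDce = [
    ['V1', 'CALL', 'foo'],
    [null, 'THROW'],
    ['V3', 'ADD', 'V1', 'false'],
    [null, 'ECHO', 'V3'],
];
// The optimizer drops everything after the unconditional THROW.
$afterDce = array_slice($beforeDce, 0, 2);

$orphansBefore = temporaries_without_consumer($beforeDce);
$orphansAfter = temporaries_without_consumer($afterDce);
```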

In theory, break/continue expressions might try to re-use live-ranges.
I recall thinking about this, but I can't seem to remember if there
was a reason not to do it.

Ilija




Re: [PHP-DEV] Discussion: making continue and break into an expression

2024-01-25 Thread Ilija Tovilo
Hi Robert

On Thu, Jan 25, 2024 at 10:16 AM Robert Landers
 wrote:
>
> Now that throwing is an expression, it allows for some very concise
> programming. What are your thoughts on making a break/continue into an
> expression as well?
>
> Instead of:
>
> while(true) {
> ...
> if(is_null($arr['var'])) continue;
> if($something) continue; else break;
> ...
> }
>
> You could write
>
> while(true) {
> ...
> $arr['var'] ?? continue;
> $something ? continue : break;
> ...
> }

This leads to very similar issues as break/continue inside blocks. See:
https://wiki.php.net/rfc/match_blocks#technical_implications_of_control_statements

I'll try to explain.

The VM works with temporary variables. For the expression foo() +
bar() two temporary variables for the result of foo() and bar() will
be created, which are then used for the + operation. Normally, + will
consume both operands, i.e. use and then free them. However, with
break/continue etc. being expressions, it would become possible to
skip over the consuming instructions.

do {
echo foo() + break;
} while (true);
echo 'Done';

Pseudo opcodes:

0000 V1 = CALL foo
0001 JMP 0005
0002 V2 = ADD V1 false ; false here represents a bottom value that will never actually be used
0003 ECHO V2
0004 JMP 0000
0005 ECHO 'Done'

Since JMP will skip over the ADD instruction, V1 remains unused. A
similar problem already exists for break/continue in foreach itself.

foreach ($foos as $foo) {
   foreach ($bars as $bar) {
   break 2;
   }
}

foreach holds a copy of $bars (in case it gets modified) that normally
gets cleaned up when the loop ends. With break over multiple
loop-boundaries, we can completely skip over this freeing mechanism.
PHP solves this by inserting an explicit FE_FREE instruction before
the break 2, which itself is essentially just a JMP to the end of the
outer loop.

Hopefully it's now more evident why this is a problem:

while (true) {
   foo() && break;
}

foo() returns a value that would normally be consumed by the &&
operation. However, with break, we may skip over the && operation
entirely. As such, the break itself becomes responsible for freeing
these values. This requires significant changes in the compiler to
track variables that are currently "live" (i.e. haven't been consumed
yet), and emitting FREE opcodes for them as needed. I've implemented
this for match blocks here:

https://github.com/php/php-src/compare/master...iluuu1994:php-src:match-blocks-var-tracking

However, note that due to complexity, I've decided to disallow using
break/continue and the likes in such contexts to avoid this issue
completely, which isn't possible for what you are suggesting.
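
To make the tracking idea concrete, here is a minimal sketch of an emitter
that remembers which temporaries are still unconsumed and frees them before
jumping out. This is a toy written in PHP for illustration; the class
ToyEmitter and its methods are hypothetical and do not mirror the patch
linked above.

```php
<?php
/** Toy opcode emitter that tracks unconsumed temporaries. */
final class ToyEmitter
{
    /** @var list<string> emitted pseudo opcodes */
    public array $opcodes = [];
    /** @var list<string> temporaries produced but not yet consumed */
    private array $live = [];
    private int $next = 1;

    /** Emits an opcode with a result and marks that result as live. */
    public function produce(string $opcode): string
    {
        $tmp = 'V' . $this->next++;
        $this->opcodes[] = "$tmp = $opcode";
        $this->live[] = $tmp;
        return $tmp;
    }

    /** Emits an opcode that consumes (uses and frees) its operands. */
    public function consume(string $opcode, string ...$operands): void
    {
        $this->opcodes[] = trim($opcode . ' ' . implode(' ', $operands));
        $this->live = array_values(array_diff($this->live, $operands));
    }

    /** Emits a break/continue style jump, freeing everything still live first. */
    public function jumpOut(string $target): void
    {
        foreach (array_reverse($this->live) as $tmp) {
            $this->opcodes[] = "FREE $tmp";
        }
        $this->opcodes[] = "JMP $target";
    }
}

// echo foo() + break;
$emitter = new ToyEmitter();
$v1 = $emitter->produce('CALL foo');
$emitter->jumpOut('loop_end');
```

The jump is preceded by FREE V1, which is the same shape as the explicit
FE_FREE before break 2 described earlier.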

There's another related issue.

foo(bar(), break);

Function calls in PHP consist of multiple instructions, namely an
INIT_CALL, 0-n SEND and a DO_CALL opcode. INIT_CALL creates a stack
frame, SEND pushes arguments onto the stack frame, and DO_CALL starts
the execution of the function and frees both arguments and stack frame
when the function ends. If prior to a SEND opcode we break, we skip
over the DO_CALL, so the stack frame needs to be freed manually.

The patch linked above solves this by inserting CLEAN_UNFINISHED_CALLS
opcodes that do as the name suggests. This mechanism is already used
for exceptions. This should work for you, but was insufficient for
match blocks, for reasons I won't get into here.
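
A toy model of the call sequence may help here. Only the opcode names come
from the description above; the class ToyCallStack, its methods and the log
strings are invented for this sketch and say nothing about how the VM
actually stores frames.

```php
<?php
/** Toy model of call frames that were initialized but not yet called. */
final class ToyCallStack
{
    /** @var list<array{fn: string, args: list<mixed>}> */
    private array $pending = [];
    /** @var list<string> trace of what happened */
    public array $log = [];

    /** INIT_CALL: create a frame for the callee. */
    public function initCall(string $fn): void
    {
        $this->pending[] = ['fn' => $fn, 'args' => []];
    }

    /** SEND: push an argument onto the innermost pending frame. */
    public function send($value): void
    {
        $this->pending[array_key_last($this->pending)]['args'][] = $value;
    }

    /** DO_CALL: run the innermost frame and free it together with its arguments. */
    public function doCall(): string
    {
        $frame = array_pop($this->pending);
        $this->log[] = "DO_CALL {$frame['fn']}: ran and freed frame";
        return "result of {$frame['fn']}";
    }

    /** CLEAN_UNFINISHED_CALLS: free frames whose DO_CALL will never run. */
    public function cleanUnfinishedCalls(): void
    {
        while ($this->pending !== []) {
            $frame = array_pop($this->pending);
            $this->log[] = "freed unfinished frame of {$frame['fn']}";
        }
    }
}

// foo(bar(), break);
$stack = new ToyCallStack();
$stack->initCall('foo');
$stack->initCall('bar');
$stack->send($stack->doCall());   // bar() completes; its result is sent to foo's frame
$stack->cleanUnfinishedCalls();   // the break: foo's DO_CALL is never reached
```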

All this to say: Don't expect the implementation here to be trivial.

Regards,
Ilija




Re: [PHP-DEV] Re: Wiki Access request

2024-01-24 Thread Ilija Tovilo
On Wed, Jan 24, 2024 at 1:22 PM Ilija Tovilo  wrote:
>
> Thank you for the list. It looks more digestible, and most of it is
> already congruent with CONTRIBUTING.md.
>
> * https://www.zend.com/resources/writing-php-extensions requires
> sharing your contact information to obtain. I'm not sure how other
> people feel about this.

Oh, never mind. That sidebar is clickable. I missed that there's an
online version.

Ilija




Re: [PHP-DEV] Re: Wiki Access request

2024-01-24 Thread Ilija Tovilo
Hi Carlos

On Wed, Jan 24, 2024 at 12:29 PM Barel  wrote:
>
> > Feel free to share this list here, on GitHub or otherwise. I'm
> > skeptical whether throwing partially outdated resources at people is
> > actually helpful.
>
> I trimmed the list that I collected so that it only included the most
> significant items which have a lot of information. This list would be:
> - https://www.phpinternalsbook.com/ PHP Internals book
> - https://phpinternals.net/ PHP Internals web site with documentation about
> a lot of the structures and macros used in the code
> - https://www.npopov.com/ Nikita Popov's blog
> - http://blog.jpauli.tech/ Julien Pauli's blog
> - https://phpinternals.news/ Derick Rethans' podcast
> - https://www.zend.com/resources/writing-php-extensions Zend's guide about
> writing PHP extensions
> - https://wiki.php.net/internals The internals page in the wiki
> - https://www.informit.com/store/extending-and-embedding-php-9780672327049
> Sara Golemon's printed book
>
> Do you think this sounds good? If you do, I will create the PR to update
> the contributing doc

Thank you for the list. It looks more digestible, and most of it is
already congruent with CONTRIBUTING.md.

* Julien's blog is a bit large, but contains some very detailed posts
that should still be relevant. It might make sense to list them
explicitly.
* https://www.zend.com/resources/writing-php-extensions requires
sharing your contact information to obtain. I'm not sure how other
people feel about this.
* https://www.informit.com/store/extending-and-embedding-php-9780672327049
might be outdated, as it was published back in the PHP 5.1 era. I'll
leave it up to Sara whether she thinks the book should be recommended
at this time.

The rest seems to be already listed.

Regards,
Ilija




Re: [PHP-DEV] [RFC][Vote] RFC1867 for non-POST HTTP verbs

2024-01-24 Thread Ilija Tovilo
Hi Sara

Thank you for your feedback.

On Tue, Jan 23, 2024 at 8:41 PM Sara Golemon  wrote:
>
> On Mon, Jan 22, 2024 at 1:24 AM Ilija Tovilo  wrote:
>
> > I started the vote on the "RFC1867 for non-POST HTTP verbs" RFC.
> > https://wiki.php.net/rfc/rfc1867-non-post
> >
>
> 1/ This function reaches into the SAPI to pull out the "special" body
> data.  That's great, but what about uses where providing an input string
> makes sense.  For that, and for point 2, I'd suggest
> `http_parse_query(string $query, ?array $options = null): array|object`.

The RFC previously included support for the $input_stream variable
(string is not very appropriate for multipart because the input may be
arbitrarily large). The implementation wasn't complex, but it required
duplication of all the reads to support both a direct read from the
SAPI and a read from the stream, duplication of some limit checks and
special passing of the streams to avoid SAPI API breakage.

As for actual use cases, I found limited evidence that this function
would be useful for worker-based services _right now_. Most services
handle request parsing in some other layer. For example, RoadRunner
has a Go server that stores the file to disk, and then just passes the
appropriate path to PHP in the $_FILES array. It seems to me that a
custom input would be useful exclusively for a web server written in
PHP. The one that was pointed out to me (AdapterMan) handles requests
as strings, which would not scale for multipart requests.

I don't mind getting back to this if AdapterMan rewrites request
handling to use streams. Adding back the $input_stream parameter can
be done with no BC breaks. But for the time being, I don't think the
motivation is big enough to justify the added complexity.

Additionally, because multipart is used exclusively as a request
content type, it isn't useful in a general sense either, because a PHP
request will typically only receive one request (but potentially
multiple responses, in case it communicates with other servers).

> 2/ `request_` represents a new pseudo-namespace, functions are easier to
> find and associate if we keep them grouped.  I recommend `http_` because it
> complements the very related function `http_build_query()`, and for the
> version of this function which refers directly to the request:
> `http_parse_request(?array $options = null) : array|object`.

That's fair. If the name bothers you I can create an amendment RFC. I
think http_parse_body() would be a bit more appropriate, because
request implies more than just the body.

Ilija




Re: [PHP-DEV] Re: Wiki Access request

2024-01-22 Thread Ilija Tovilo
Hi Carlos

You should now have access to the /internals sub-pages.

On Mon, Jan 22, 2024 at 11:38 AM Barel  wrote:
>
> I didn't know that there was a list of tech resources listed in the
> "CONTRIBUTING.md" file. I have been researching resources that have
> information about PHP internals and have found a lot, more than twenty. I
> can think of three possibilities:
> - List them all in the contributing document but this can be a bit
> overwhelming as, like I said, the number of resources available is big

Feel free to share this list here, on GitHub or otherwise. I'm
skeptical whether throwing partially outdated resources at people is
actually helpful.

> - Add a new page in the Github repo with all these links and link to that
> page from the contributing page
> - Keep the list in the wiki and link to that page from the contributing page

One of the reasons I'd like to move off of the wiki for these things
is that there is no review process. We highly value volunteer work,
but trusting new contributors to blindly make changes to official
guides is obviously somewhat problematic.

I'd prefer to make CONTRIBUTING.md the "official list" for now. This
file is allowed to be long. And as mentioned, I suspect that the list
of 20+ items may be trimmed.

Ilija




[PHP-DEV] [RFC][Vote] RFC1867 for non-POST HTTP verbs

2024-01-22 Thread Ilija Tovilo
Hi everyone

I started the vote on the "RFC1867 for non-POST HTTP verbs" RFC.
https://wiki.php.net/rfc/rfc1867-non-post

Ilija




Re: [PHP-DEV] Newly Created Wiki Account - Quick Introduction

2024-01-22 Thread Ilija Tovilo
Hi Jair!

On Mon, Jan 22, 2024 at 5:14 AM Jair Humberto  wrote:
>
> My name is Jair Humberto, I am Brazilian and have been working with PHP
> since 2006. I am super excited about the new things PHP has launched lately
> and finally realised that I can contribute as well. I think this is a new
> phase of my career, I want to learn a bit more about the php source code and
> processes and I plan to start contributing with small things, but my first
> main goal (long term) is to make possible generics in PHP, one day!

Welcome! Note that you do not need a wiki account for contributing to
PHP. Source code is managed on GitHub (https://github.com/php/php-src)
and changes are made via pull requests. Nowadays, you'll only really
need a wiki account to create RFCs.

Check the CONTRIBUTING.md file for some good resources on the internals of PHP.
https://github.com/php/php-src/blob/master/CONTRIBUTING.md#technical-resources

If you need any guidance, feel free to contact me.

Ilija




Re: [PHP-DEV] Wiki Access request

2024-01-21 Thread Ilija Tovilo
Hi Carlos!

On Fri, Jan 19, 2024 at 4:56 PM Barel  wrote:
>
> This page in the Wiki https://wiki.php.net/internals/references has a lot
> of links which are outdated and should be removed or changed. If you can
> provide wiki edit access for me I can work on updating them and also try to
> find more recent links to add to that reference page (suggestions welcome!!)
>
> My username for the wiki site is "barelon" in case this is needed

I can absolutely give you write access to these pages. Updating this
list to reflect more up-to-date resources certainly makes sense.

As you probably know, there are a number of different places where php
internals are documented, and I think that, long-term, it makes sense
to try to consolidate these efforts. We briefly spoke about this in
the last foundation meeting.

We have some documentation in the php-src repository itself, a
significant amount in the php internals book
(https://www.phpinternalsbook.com/), some in the wiki, some on blogs
of current or previous contributors, etc. There are a number of things
that are important when it comes to documentation, like convenience,
access, history, discoverability, etc.

While not the worst, I don't think the wiki is the best place for this
work. Handling documentation directly through Git in PHP's main
repository (or at least a repository in the PHP organization) would
likely tick the most boxes. Providing documentation with PRs might
also improve the understanding of intention of the changes for
reviewers.

As for links to other references, I believe the CONTRIBUTING.md file
is currently the most up-to-date.

https://github.com/php/php-src/blob/master/CONTRIBUTING.md#technical-resources

Rather than duplicating this list on the wiki, it might make sense to
reference the CONTRIBUTING.md file, and extend it as necessary.

Ilija




[PHP-DEV] Re: [RFC][Under discussion] RFC1867 for non-POST HTTP verbs

2024-01-17 Thread Ilija Tovilo
Hi Joan

Sorry for the late response.

On Thu, Dec 14, 2023 at 6:08 PM Joanhey  wrote:
>
> We can't use sapi_module.read_post() from CLI.
>
> https://github.com/joanhey/AdapterMan
> This runtime use the CLI-SAPI, but this SAPI is very limited. We can use
> parse_str() easily for 'application/x-www-form-urlencoded' but we need
> to replicate in userland for 'multipart/form-data'.
> https://github.com/joanhey/AdapterMan/blob/master/src/Http.php#L410-L416
> https://github.com/joanhey/AdapterMan/blob/master/src/ParseMultipart.php

Yes, a web server written in PHP is indeed the one use-case for the
$input_stream parameter.

Looking at AdapterMan, it looks like you're handling requests as
strings. 
https://github.com/joanhey/AdapterMan/blob/4171d0218a253b2b4c178af067bd4601dd4daf80/src/ParseMultipart.php#L23

It doesn't seem like this would scale well for multipart requests. Do
you reckon this can be rewritten to use streams instead? Otherwise the
feature seems half-baked.

I'm going forward with the RFC as is. I'm not against re-adding
support for $input_stream at a later point in time. But it should be
demonstrated that AdapterMan can actually make good use of it.

Ilija




Re: [PHP-DEV] [VOTE] [RFC] Final-by-default anonymous classes

2024-01-15 Thread Ilija Tovilo
Hi Daniil

On Mon, Jan 15, 2024 at 11:36 AM Daniil Gentili
 wrote:
>
> Hi all,
>
> I've opened voting for the final-by-default anonymous classes RFC:
> https://wiki.php.net/rfc/final_by_default_anonymous_classes

It seems you've edited the text of the poll. Doing so disassociates
the existing votes from the poll, so the existing votes were gone. I
reverted the change, the votes are now back.

Ilija




Re: [PHP-DEV] RFC karma request

2023-12-28 Thread Ilija Tovilo
Hi Valentin

On Thu, Dec 28, 2023 at 1:31 PM Valentin Udaltsov
 wrote:
>
> I kindly request RFC Karma for my wiki account vudaltsov. I am planning to
> publish RFC "new MyClass()->method() without parentheses". I already
> created a PR, where I got generally positive feedback on this feature:
> https://github.com/php/php-src/pull/13029

RFC karma was granted. Good luck!

Ilija




Re: [PHP-DEV] Re: [RFC][Under discussion] RFC1867 for non-POST HTTP verbs

2023-12-08 Thread Ilija Tovilo
Hi Sam

>> On Fri, Oct 6, 2023 at 3:44 PM Ilija Tovilo  wrote:
>> > https://wiki.php.net/rfc/rfc1867-non-post
On Thu, Dec 7, 2023 at 6:04 PM Sam I  wrote:
>
> Hey, I'm not sure if this is bikeshedding, but the concept of parsing bodies 
> for non-POST requests lands really close to a proposal for adding a QUERY 
> method to the HTTP standard.
> Draft: 
> https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-safe-method-w-body
> Discussion: 
> https://github.com/httpwg/http-extensions/labels/safe-method-w-body
>
> It's meant to address the recent need for complex querying (GraphQL / Elastic 
> Search) that necessitates using POST but loses the default caching of GET.
> I think this RFC could serve as the groundwork for supporting QUERY if it's 
> extended to other MIME types in the future as Larry suggested. But QUERY 
> probably still has years to go before there is a consensus on it (I think 
> it's been talked about for 6+ years now)

Looking at the RFC, it doesn't seem like multipart is an intended
format for QUERY requests. application/x-www-form-urlencoded is
intended and should work as-is with request_parse_body(). Parsing
anything other than multipart and form-data is possible, but not at
all related to QUERY (and IMO not desirable).

You can implement a QUERY endpoint in PHP today (if your web server
supports it) by reading from php://input. It's worth noting that PHP's
built-in server does not support custom HTTP methods and will return a
501.
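
For the application/x-www-form-urlencoded case, a minimal sketch of such an
endpoint could look like this. It relies only on parse_str() and
php://input; the helper name parse_urlencoded_body() and the sample payload
are made up for the example.

```php
<?php
/** Parses an application/x-www-form-urlencoded request body into an array. */
function parse_urlencoded_body(string $rawBody): array
{
    parse_str($rawBody, $fields);
    return $fields;
}

// In a real endpoint the body would come from the request:
//   $fields = parse_urlencoded_body(file_get_contents('php://input'));
// A fixed string is used here so the sketch also runs on the command line.
$fields = parse_urlencoded_body('q=php&tags%5B%5D=rfc&tags%5B%5D=http');
```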

I don't think there's anything actionable here. Please clarify if I'm
missing something.

Ilija




Re: [PHP-DEV] bugs.php.net still active?

2023-12-07 Thread Ilija Tovilo
Hi Aleksander

On Thu, Dec 7, 2023 at 9:22 AM Aleksander Machniak  wrote:
>
> I was under impression that bugs.php.net was supposed to be phased out.
> I.e. made read-only or something.
>
> https://bugs.php.net/bug.php?id=78628=1 proves that it's not the
> case and I'm receiving annoying spam recently.

From the GitHub issues RFC:

> Per the above, bugs.php.net will remain active for the following purposes:
> * Reporting of security issues against PHP.
> * Commenting/updating on existing issues.

https://wiki.php.net/rfc/github_issues#bugsphpnet

Security issues have since been moved to GitHub. However,
commenting/updating bugs is still possible. IMO it would make sense to
limit this functionality to admins, and encourage users who want to
add context to create a new issue on GitHub. It has much better spam
protection, and visibility will be higher there too.

Ilija




[PHP-DEV] Re: [RFC][Under discussion] RFC1867 for non-POST HTTP verbs

2023-12-05 Thread Ilija Tovilo
Hi everyone

On Fri, Oct 6, 2023 at 3:44 PM Ilija Tovilo  wrote:
>
> I'd like to announce an RFC that proposes adding a new function called
> parse_post_data() to expose the existing functionality to userland, so
> that the mechanism can be used for other HTTP verbs.
>
> https://wiki.php.net/rfc/rfc1867-non-post

I took a closer look at RoadRunner regarding file uploads and noticed
that $input_stream will not be useful to it after all. RoadRunner
handles files directly in its Go server by parsing the multipart body,
storing any files to disk, and only transferring the file handles
(along with any post data) to the PHP workers. New SAPIs could instead
tweak sapi_module.read_post() when handling a new request. We can add
these parameters later if somebody presents a valid use-case; for now, I
opted to remove the $input_stream and $content_type parameters.

Please let me know if you have any more feedback. I will wait at least
2 weeks before going forward with a vote, as this is a bigger change
to the RFC.

Ilija




Re: [PHP-DEV] Reproducible Builds

2023-11-28 Thread Ilija Tovilo
Hi Sebastian

On Tue, Nov 28, 2023 at 6:28 PM Sebastian Bergmann  wrote:
>
> I recently watched a video [1] that once again brought the topic of
> reproducible builds [2] to my attention.
> ...
> I have not yet checked whether usage of the __DATE__ and __TIME__ macros
> is the only thing that makes the compilation of PHP irreproducible, but no
> longer using them would be a good start on the path towards reproducible
> builds.

At least for core, enabled-by-default extensions, __DATE__ and
__TIME__ seem to be the only variables. I can get reproducible builds
by setting SOURCE_DATE_EPOCH.

> While we could probably replace __DATE__ and __TIME__ with
> SOURCE_DATE_EPOCH [3] [4], ...

Both GCC and Clang support SOURCE_DATE_EPOCH out of the box, setting
__DATE__ and __TIME__ accordingly. MSVC (shockingly) does not.
However, reproducible builds likely don't matter as much for Windows
since we provide the binaries for it.

That said, I wouldn't object to removing the date either.

Ilija




Re: [PHP-DEV] [RFC][Discussion] Harmonise "untyped" and "typed" properties

2023-11-16 Thread Ilija Tovilo
Hi Rowan,

Thanks for the RFC.

On Thu, Nov 16, 2023 at 9:42 PM Rowan Tommins  wrote:
>
> I have finally written up an RFC I have been considering for some time:
> Harmonise "untyped" and "typed" properties
>
> RFC URL: https://wiki.php.net/rfc/mixed_vs_untyped_properties

Unifying the unset behavior sounds sensible. I don't see a good reason
for a declared property to be hidden through unset. If this behavior
is desired, the property should be declared dynamically in the
constructor with #[AllowDynamicProperties] added to the class.

Like Jakub, I am also worried about the BC break of changing the
default value to uninitialized. However, I don't think opt-in behavior
is worthwhile (because I don't believe people make sufficient use of
it, while adding something new we have to support forever). If we pick
either of the options you presented (1. initialize all nullable
properties to null, 2. Make all properties uninitialized) I'd vote for
the first one. Then again, given (it seems) people are happier with
the strict behavior of typed properties, do we need to unify the
behavior at all if it means losing that? Currently, they can choose
the behavior they prefer by adding or omitting mixed.
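
For reference, a short sketch of the current differences being discussed.
The class and function names are only for illustration; the behavior shown
is that of existing PHP 8 releases, not of the RFC.

```php
<?php
class Example
{
    public $untyped;       // implicitly initialized to null
    public mixed $typed;   // uninitialized until the first assignment
}

/** Reports whether reading the typed property succeeds or throws. */
function read_typed(Example $e): string
{
    try {
        $e->typed;
        return 'initialized';
    } catch (Error $err) {
        return 'Error';
    }
}

/** A declared untyped property that is unset falls through to __get(). */
class Lazy
{
    public $value;

    public function __construct()
    {
        unset($this->value);
    }

    public function __get(string $name)
    {
        return "computed $name";
    }
}

$e = new Example();
$untypedDefault = $e->untyped;        // null
$typedBeforeInit = read_typed($e);    // reading throws an Error
$lazyValue = (new Lazy())->value;     // served by __get()
```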

Furthermore, the first approach clashes somewhat with readonly.
Readonly can't have null as a default value because that would make it
legal to access the value before the property is explicitly
initialized, and the property would no longer be uninitialized, making
the explicit assignment illegal. We could just not add the default
value for readonly, but that means more special rules that we're
trying to avoid.
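
The readonly rule in question can be seen in a small sketch (the Point class
is just an example; this shows existing PHP 8.1+ behavior). A readonly
property may be initialized exactly once, so an implicit null default would
already use up that one assignment.

```php
<?php
final class Point
{
    public readonly int $x;

    public function __construct(int $x)
    {
        $this->x = $x;   // the one permitted initialization
    }

    /** Attempts a second write and reports what happened. */
    public function tryReassign(int $x): string
    {
        try {
            $this->x = $x;
            return 'reassigned';
        } catch (Error $e) {
            return get_class($e);
        }
    }
}

$point = new Point(1);
$secondWrite = $point->tryReassign(2);   // the write is rejected with an Error
```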

Ilija




Re: [PHP-DEV] [RFC] [Discussion] Release cycle update

2023-11-10 Thread Ilija Tovilo
Hi Jakub

Thank you for the proposal.

On Fri, Nov 10, 2023 at 5:52 PM Jakub Zelenka  wrote:
> https://wiki.php.net/rfc/release_cycle_update

> Currently beta is called a feature freeze but effectively it isn't. The main 
> issue with that is that the end of alpha just means that all RFCs targeting 
> that version must have voting finished but the implementation can be done 
> during beta. This is however a major inconsistency because RFC 
> implementations are often those that can have a major impact on API and ABI 
> stability so it seems illogical to allow that but not allow minor 
> improvements that do not require an RFC.

I think the general expectation with the current process is that an
RFC implementation would be merged reasonably shortly after feature
freeze. Extending this to the entire beta period may be risky,
especially if we're also shortening the RC period. With an
implementation merged last-minute, this gives us 2 months to iron out
any issues. With multiple RFCs merged last-minute things could become
quite overwhelming.

I also think library maintainers might want more than 2 months' time to
make and test their changes, especially when it comes to BC-breaking
changes.

I'm not sure how this can be addressed. We should at least encourage
changes to be merged early in the beta cycle.

Ilija




Re: [PHP-DEV] Previous discussions about generics syntax only?

2023-11-03 Thread Ilija Tovilo
Hi Daniil

On Fri, Nov 3, 2023 at 12:00 AM  wrote:
> The much better approach, one that I intend to maybe give a shot at this 
> Christmas, is to add static analysis functionality to PHP itself (i.e. turn 
> it into a truly statically typed language).
> I have a hunch it may be easy enough to do by hooking into the type inference 
> functionality provided by opcache, and throw compile-time exceptions instead 
> of silently inserting runtime typechecks.

The optimizer, including type inference, is limited to the scope of
the current file (along with internal functions/classes). Each file is
considered a "single compilation unit". When classes from different
files reference each other, modifying one file does not require
recompiling the other. However, this does mean that we cannot rely on
information from other files as they may change at any point.
Preloading is the exception: all preloaded files can be assumed not
to change after PHP has booted.

We obviously cannot limit type checking to preloaded files. We could
make type checking a CLI step but how is that really better than
PHPStan or Psalm at that point, other than having the official PHP
stamp? PHPStan and Psalm are arguably successful *because* they are
written in PHP, making them much easier to maintain and contribute to.

I'd also like to add that tools like PHPStan and Psalm have much more
accurate type representations. PHP does not accurately represent
arrays in the optimizer and has no notion of array shapes. The
optimizer's types are biased towards speed rather than accuracy.

Another issue, specifically pertaining to generics, is that PHP has
type coercion. In both weak and strict typing mode, a float function
parameter will coerce an integer value. However, if generic types are
erased at runtime then the VM cannot do coercion for foo<float>($int)
(where function foo<T>(T $var)). This will require either accepting
inaccurate runtime types, or establishing stricter static rules that
do not match the existing behavior.
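
The existing coercion referred to here is easy to observe in a short sketch
(the function name describe() is arbitrary). The conversion happens at
runtime when the argument is bound to the declared parameter type, which is
the information that would be missing if generic type arguments were erased.

```php
<?php
/** Returns the runtime type of the value as seen inside the function. */
function describe(float $value): string
{
    return get_debug_type($value);
}

$int = 3;
$typeInside = describe($int);           // the int was converted to a float on entry
$typeOutside = get_debug_type($int);    // the caller's variable is still an int
```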

Ilija




Re: [PHP-DEV] Re: [RFC][Under discussion] RFC1867 for non-POST HTTP verbs

2023-10-31 Thread Ilija Tovilo
Hi Derick

On Wed, Oct 18, 2023 at 6:10 PM Derick Rethans  wrote:
> On Fri, 13 Oct 2023, Ilija Tovilo wrote:
> > > https://wiki.php.net/rfc/rfc1867-non-post
>
> The only comment I would have is that I probably would be in favour of
> not leaving the "config" argument (to over ride per call the
> post_max_size settings etc) to a future scope.
>
> Having to do ini_get/ini_set/ini_set(old) to override these settings
> seems clunky.

As per request I've included an $options parameter to override the
relevant INI values for the duration of the function call.
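
A hedged sketch of how this could be used, assuming the function is exposed
as request_parse_body() (the name used later in this thread), that it
returns the parsed fields and files as a pair, and that option keys mirror
the INI names such as post_max_size. The wrapper name parse_put_body() and
the 128M value are made up. The call only makes sense under a web SAPI, so
the function is defined here but not invoked.

```php
<?php
/**
 * Parses the body of a PUT/PATCH request with a larger size limit for this
 * call only, leaving the global INI settings untouched.
 */
function parse_put_body(): array
{
    if (!function_exists('request_parse_body')) {
        throw new RuntimeException('request_parse_body() is not available in this PHP build');
    }

    // Assumption: the option key matches the INI directive name.
    [$post, $files] = request_parse_body(['post_max_size' => '128M']);

    return ['post' => $post, 'files' => $files];
}
```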

Ilija




Re: [PHP-DEV] Discussion - Anti-null coercion

2023-10-30 Thread Ilija Tovilo
Hi Robert

On Sun, Oct 29, 2023 at 7:31 PM Robert Landers  wrote:
>
> Hello Internals,
>
> We currently have a null coercion operator: ??, but we lack an
> anti-null coercion operator.
> ...
> fn() =>
>   ($_SERVER['HTTP_X_MY_HEADER'] ?? null)
>   ? md5($_SERVER['HTTP_X_MY_HEADER'])
>   : null;
> ...
> This is rather tedious when you have to do it, so, I'd like to discuss
> adding a new "anti-null coercion" operator: ?!
>
> This would collapse the previous verbose code into:
>
> fn() =>
>   $_SERVER['HTTP_X_MY_HEADER']
>   ?! md5($_SERVER['HTTP_X_MY_HEADER']);

This does not seem significantly less verbose to me. The main
motivation for ?? was that it avoids repeating the expression over
something like ?:. I would see a stronger argument for this feature if
it offered the same benefit. E.g.

$_SERVER['HTTP_X_MY_HEADER'] ?! md5($$)

> It would have a lower precedence than ?? so that the above line would
> read from left to right without requiring parenthesis/brackets. The
> operator would only return the right-hand side if the left-hand side
> exists (aka, not null), otherwise, it would return null.

I think it should have a higher precedence.

$_SERVER['HTTP_X_MY_HEADER'] ?! md5($$) ?? 'abc'
==>
($_SERVER['HTTP_X_MY_HEADER'] ?! md5($$)) ?? 'abc'

Otherwise the result is NULL if the header is missing, given that the
coalesce operator is never executed.

That said, while I've certainly encountered this situation, it's
nothing a temporary variable can't fix. I don't personally believe
there's a strong need for such an operator.
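
For illustration, the temporary-variable version could look roughly like
this. The helper name is hypothetical, and $server stands in for $_SERVER
so the function can be exercised anywhere. Note that it checks for null
explicitly, whereas the ternary in the quoted example checks truthiness.

```php
<?php
// Hypothetical helper: hash the header value if it is present,
// otherwise return null.
function hashedHeader(array $server): ?string
{
    // The temporary variable avoids repeating the array access.
    $header = $server['HTTP_X_MY_HEADER'] ?? null;

    return $header !== null ? md5($header) : null;
}

var_dump(hashedHeader([]));
var_dump(hashedHeader(['HTTP_X_MY_HEADER' => 'abc']));
```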

Ilija




Re: [PHP-DEV] Request of RFC karma

2023-10-27 Thread Ilija Tovilo
Hi Alessandro

Sorry for the delayed response.

On Wed, Oct 25, 2023 at 4:11 PM Alessandro Rosa
 wrote:
> I would like to receive wiki RFC karma for my account, in order to
> submit an RFC on the built-in "empty" function.
> My account is alessandro.a.rosa_gmail.com

RFC karma was granted. Good luck on the RFC!

Note that it would still be helpful if you followed step 1. from the howto:
https://wiki.php.net/rfc/howto

> Email internals@lists.php.net to measure reaction to your intended proposal. 
> State who would implement the feature, or whether the proposal is only a 
> “concept”. Proceed with an RFC if feedback is not negative or if a detailed 
> RFC will clarify the proposal.

A short summary of the problem and how the new functionality will
solve it should suffice. The feedback will give you a better idea of
whether the RFC is worth pursuing, whether your idea may be improved,
and what details need to be specified.

Ilija




Re: [PHP-DEV] Re: [RFC][Under discussion] RFC1867 for non-POST HTTP verbs

2023-10-19 Thread Ilija Tovilo
Hi Larry

On Wed, Oct 18, 2023 at 7:26 PM Larry Garfield  wrote:
> > On Fri, Oct 6, 2023 at 3:44 PM Ilija Tovilo  wrote:
> >> https://wiki.php.net/rfc/rfc1867-non-post
>
> The functionality all seems reasonable to me.  I have a few smaller concerns:
>
> 1. Like Derick, I think I'd favor including the config overrides now.

I will check if this can be implemented without too many changes.

> 2. Lots of request bodies are not forms these days; they're frequently JSON 
> or GraphQL.  This function would be useless in those cases; that's fine, but 
> should the name then suggest that it's for form data only?  
> request_parse_form() or similar?  I'm just concerned about misleading people 
> into thinking it can parse their JSON bodies, when that's not a thing.  
> (Unless we wanted to provide some kind of callback mechanism, which is 
> probably overkill here.)

request_parse_body() is indeed not aimed at other content types as-is.
A generic name would allow extending the function to support other
content types in the future, although it's currently unclear whether
that's desirable. E.g. for JSON, people might be confused why there's
a file index in the returned array that is always empty.

> 3. For an unsupported mime type, I'd recommend a more specific exception than 
> InvalidArgumentException.  Give that a custom sub-class that tracks what the 
> actual mime type was that the request had and it rejected.

A custom exception class sounds reasonable. The mime-type is contained
in the exception message.

> 4. I don't quite grok the "input" section.  So if I don't disable the 
> automatic parsing, does that mean request_parse_body() will always fail?  Or 
> will it still work, but just be more memory-wasteful?  That's not clear to 
> me; I'd prefer if it works but is just memory-wasteful, personally, as that 
> would be more portable for projects that want to use it.

Whether request_parse_body() can work repeatedly depends on whether
the input has been buffered. application/x-www-form-urlencoded is
buffered and as such works multiple times. multipart/form-data *can*
be buffered if done manually by opening php://input and reading the
whole input before calling request_parse_body(). Yes, for POST
requests this means one needs to disable enable_post_data_reading.

Since files are stored on disk, buffering files means doubling disk
load. Uploading a 2GB file would require at least 4GB of disk space. I
don't think that's a good trade-off as there shouldn't be a reason to
call this function twice.

Ilija




Re: [PHP-DEV] RFC Karma request

2023-10-17 Thread Ilija Tovilo
Hi Daniil

On Tue, Oct 17, 2023 at 8:25 PM Daniil Gentili  wrote:
> I'd like to create RFCs at least for final anonymous classes
> (https://externals.io/message/121356) and a small tweak to JIT defaults
> (https://externals.io/message/121359).
>
> Please give me RFC Karma :)
>
> Username: danog

RFC karma was granted, good luck with your RFC!

Ilija




Re: [PHP-DEV] RFC Karma Request

2023-10-17 Thread Ilija Tovilo
Hi Yuya

RFC karma was granted, good luck with your RFC!

Ilija

On Tue, Oct 17, 2023 at 5:07 PM youkidearitai  wrote:
>
> Hi, internals.
>
> I am writing multibyte-aware trim functions: mb_trim, mb_ltrim
> and mb_rtrim.
> https://github.com/php/php-src/pull/12459
>
> Please give me RFC Karma.
> Username: youkidearitai
>
> Regards.
> Yuya
> --
> ---
> Yuya Hamada (tekimen)
> - https://tekitoh-memdhoi.info
> - https://github.com/youkidearitai
> -
>
>




[PHP-DEV] Re: [RFC][Under discussion] RFC1867 for non-POST HTTP verbs

2023-10-13 Thread Ilija Tovilo
Hi everyone

On Fri, Oct 6, 2023 at 3:44 PM Ilija Tovilo  wrote:
> https://wiki.php.net/rfc/rfc1867-non-post

Thank you for the feedback so far. I made a handful of changes to the RFC.

* The function is renamed to request_parse_body()
* The function will now throw instead of emitting warnings when hitting limits
* The Configuration section was added to show how parsing limits may be
modified per endpoint
* The php://input stream is explained better in relation to multipart parsing

Let me know if you have any more feedback.

Ilija




[PHP-DEV] Re: [RFC][Under discussion] RFC1867 for non-POST HTTP verbs

2023-10-07 Thread Ilija Tovilo
Hi Marco

Please note that you have accidentally created a new thread. I'm
responding from the main thread.

> >>> On Fri, Oct 6, 2023 at 2:44 PM Ilija Tovilo  
> >>> wrote:
> >>> https://wiki.php.net/rfc/rfc1867-non-post
>
> Just wanted to mention that maybe this is a great opportunity to create a 
> request_ family and start with request_parse_post_data

Something like request_parse_body() could work. That should satisfy
both your and Michał's request. This naming extends nicely to other
formats if added later on.

This reminded me that it should be specified what happens when the
format is not supported (i.e. anything but multipart or urlencoded
formats). The automatically invoked behavior does nothing and leaves
the input stream unconsumed so that the PHP script can process it. For
the explicitly invoked version we should throw an exception to inform
the user that nothing happened.

Ilija




Re: [PHP-DEV] [RFC][Under discussion] RFC1867 for non-POST HTTP verbs

2023-10-07 Thread Ilija Tovilo
Hi Jakub

>> https://wiki.php.net/rfc/rfc1867-non-post
>>
>
> It should probably explicitly mention that it uses the same inis like 
> max_input_vars, max_file_uploads and max_multipart_body_parts.

Indeed, I will mention that. Thank you.

> It's kind of a strange function, as I can't decide where it should be placed. I 
> think it might be better as a stream function if it accepts only a stream. It 
> could then go to the stream funcs and be called stream_parse_post_data 
> instead, but I'm not 100% sure about it, as it doesn't exactly fit there 
> either. It still seems better than the html functions (where it's placed in the 
> current PR), as it has nothing to do with HTML IMHO.

TBH I have no idea how it landed there. When I created the PoC I just
threw it in somewhere, but it's indeed an odd place. I'll try to find
something better.

I don't think stream is a great place, as the common case of not
providing an input does not operate on streams, but on
sapi_module.read_post() directly.

I also don't think rfc1867.c is a great place as Ben suggested,
because the invoked function is actually agnostic to the exact format.
Instead, we use the existing functionality that chooses the parser
based on the content type, which includes
application/x-www-form-urlencoded.

Ilija




Re: [PHP-DEV] [RFC][Under discussion] RFC1867 for non-POST HTTP verbs

2023-10-07 Thread Ilija Tovilo
Hi Tim

> On 10/6/23 15:44, Ilija Tovilo wrote:
> > https://wiki.php.net/rfc/rfc1867-non-post
> >
>
> Regarding the cleanup of the files, perhaps the files could be read into
> a `php://temp` stream
> (https://www.php.net/manual/en/wrappers.php.php#wrappers.php.memory)?
>
> While this would cause the function to be incompatible with $_FILES, I
> think it would make for a much nicer API and it would also automatically
> solve the cleanup problem.

php://temp would solve auto-cleanup of files nicely. However, whether
they are easier to work with will depend on what you're doing with the
file. The most common action after a file upload is arguably to move
it to a permanent location using move_uploaded_file(). With a stream
the obvious way to achieve the same is stream_copy_to_stream().
However, as the stream already has a file backing (if big enough, at
least) this copy is unnecessary. Please correct me if there's
something I have missed.
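
To make the comparison concrete, here is a rough sketch of the stream-based
variant. The php://temp stream and the target path are stand-ins chosen for
illustration, not part of the RFC.

```php
<?php
// Stand-in for an uploaded file exposed as a php://temp stream.
$upload = fopen('php://temp', 'w+');
fwrite($upload, 'example upload contents');
rewind($upload);

// Hypothetical permanent location for the upload.
$target = tempnam(sys_get_temp_dir(), 'upload_');
$destination = fopen($target, 'w');

// Copies the bytes a second time, even if php://temp had already
// spilled the data to a file on disk.
$copied = stream_copy_to_stream($upload, $destination);

fclose($destination);
fclose($upload);
```

With a regular uploaded file, move_uploaded_file() can move the existing
file into place instead of copying its contents.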

I also would really like to avoid subtle differences between the
automatically and manually invoked files. Given that the overwhelming
majority will not use PHP with something like RoadRunner, I think it
makes more sense to add the special casing (i.e. deleting the files
manually) in the uncommon case, than for everybody else to adapt their
code for the uncommon case.

Ilija




[PHP-DEV] [RFC][Under discussion] RFC1867 for non-POST HTTP verbs

2023-10-06 Thread Ilija Tovilo
Hi everyone

A while ago I wrote an e-mail about RFC1867 (multipart/form-data) not
being parsed by PHP for non-POST requests.
https://externals.io/message/120641

I'd like to announce an RFC that proposes adding a new function called
parse_post_data() to expose the existing functionality to userland, so
that the mechanism can be used for other HTTP verbs.

https://wiki.php.net/rfc/rfc1867-non-post

As opposed to the semantics I suggested in the previous thread, this
proposal returns the parsed result instead of populating it directly
to the superglobals, and it accepts an optional input stream.

Let me know if you have any feedback.

Ilija




Re: [PHP-DEV] What should I do to create an RFC?

2023-10-01 Thread Ilija Tovilo
Hi Saki

On Sat, Sep 30, 2023 at 6:04 AM Saki Takamachi  wrote:
> I want to create an RFC. This is my first time.
>
> The next two pages each have sections on how to create RFCs. However, they 
> differ slightly in content. Which way should I use?
>
> https://wiki.php.net/rfc/howto
> https://wiki.php.net/rfc/voting
>
> I already have a wiki account.
> Is requesting karma the right way? Or is requesting membership the right 
> answer?

I've granted you RFC karma. Good luck!

Ilija




Re: [PHP-DEV] Wiki account for RFC

2023-09-24 Thread Ilija Tovilo
On Sat, Sep 23, 2023 at 3:36 PM Marc  wrote:
>
> Hi,
>
> I'd like to create an RFC for integer rounding as proposed here
> https://externals.io/message/120373
>
> I already had an account for the wiki years ago but somehow lost my login.
>
> Username: mabe
>
> EMail: php@mabe.berlin
>
> Also I don't get an email on trying https://wiki.php.net/start?do=resendpwd
>
> Could you please tell me what I have to do to log back in?

Hi Marc

It seems you're using the wrong e-mail address. I sent you a private
e-mail with the name of the address, in case it isn't public.

Ilija




[PHP-DEV] [RFC][Draft] Match block

2023-09-08 Thread Ilija Tovilo
Hello everyone

I've been working on match blocks over the last few weeks.
https://wiki.php.net/rfc/match_blocks

I've already shared it in R11 and got conflicting feedback, which
makes me unsure on how to proceed. We have a few options.

1. Add blocks only to match, possibly adding blocks to other
constructs in separate RFCs (the approach of this RFC)
2. Support block expressions as a language-level concept, analogous to
https://doc.rust-lang.org/reference/expressions/block-expr.html
3. Do nothing

The two main complaints/questions I've gotten were whether this
approach is the right one, and whether the syntax can be improved. The
RFC tries to go into detail on explaining the rationale for the chosen
approach. Additionally, it proposes a few alternate syntax options,
although none of them are very satisfactory.

At this point I'm unsure whether a proposal can satisfy all camps. Let
me know if you have any thoughts/ideas.

Ilija




Re: [PHP-DEV] Access property of object stored in a constant

2023-08-19 Thread Ilija Tovilo
Hi Juliette

> > Since https://wiki.php.net/rfc/new_in_initializers we can store
> > objects in global constants. However, we may not actually read or
> > write to the properties of those objects without first fetching the
> > constant into a local variable.
> >
> > const a = new stdClass;
> > a->b = 42; // Fatal error: Cannot use temporary expression in write context
> > $a = a;
> > $a->b = 42; // Works fine
> >
> > This issue was reported twice, so it seems like this code is generally
> > expected to work.
> > https://github.com/php/php-src/issues/10497
> > https://github.com/php/php-src/issues/11781
> >
> > I have created a patch here to add support for this syntax:
> > https://github.com/php/php-src/pull/11788
>
> I totally understand that people are trying to do this, but this still
> very much feels like scope creep.
>
> IIRC the new in initializers feature was _intended_ only for enums
> (which can't take properties). Now suddenly a "constant" would no longer
> be constant... In which case, what's the point of declaring it as a
> constant ?

This patch doesn't change the mutability of objects in constants, as
they already don't offer interior immutability. https://3v4l.org/s7rHE
This is analogous to `const` in JavaScript or `readonly` properties in
PHP, where we can't change the value of the variable (or const in this
case), but we can modify the properties of the object it's pointing
to.
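
A small sketch of what that looks like today (PHP 8.1+), using only the
local-variable workaround from the original message. The constant and
property names are made up for the example.

```php
<?php
// Requires PHP 8.1+ (new in initializers).
const SETTINGS = new stdClass;

$settings = SETTINGS;     // fetch the object handle into a local variable
$settings->debug = true;  // mutates the object the constant points to

$again = SETTINGS;        // the constant still refers to the same object
var_dump($again->debug);
var_dump($again === $settings);
```

The constant itself cannot be rebound to a different object, but the
object it refers to was never immutable.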

I believe the main motivation for `new` in constant expressions was to
support nested attributes. Enums have their own mechanism for
instantiating cases. Since it was decided to expand the support for
`new` to global constants, I would expect it to work with other
language constructs, unless there's a good reason for it not to.

Ilija




[PHP-DEV] Access property of object stored in a constant

2023-08-18 Thread Ilija Tovilo
Hi everyone

Since https://wiki.php.net/rfc/new_in_initializers we can store
objects in global constants. However, we may not actually read or
write to the properties of those objects without first fetching the
constant into a local variable.

const a = new stdClass;
a->b = 42; // Fatal error: Cannot use temporary expression in write context
$a = a;
$a->b = 42; // Works fine

This issue was reported twice, so it seems like this code is generally
expected to work.
https://github.com/php/php-src/issues/10497
https://github.com/php/php-src/issues/11781

I have created a patch here to add support for this syntax:
https://github.com/php/php-src/pull/11788

Since this is a language change I would like to ask for feedback
before merging. As always, if there are concerns I will create a small
RFC instead. I will also only merge this for 8.4, as feature freeze
for 8.3 has long passed.

Ilija




Re: [PHP-DEV] Requesting RFC karma for athos

2023-08-01 Thread Ilija Tovilo
Hi Athos

On Sat, Jul 29, 2023 at 3:37 AM Athos Ribeiro  wrote:
> I hereby request RFC karma for my recently created wiki account name
> "athos".

I granted you RFC karma. Please let me know if it works, it's the
first time I did it so I'm not sure I did it right. :)

Ilija




Re: [PHP-DEV] ??= and function calls

2023-07-05 Thread Ilija Tovilo
Hi Flávio

On Wed, Jul 5, 2023 at 12:17 PM Flávio Heleno
 wrote:
>> > I recently discovered some unfortunate behavior of the coalesce
>> > assignment operator (??=) in combination with function calls.
>
> Great catch Ilija!
>
> Do you mind sharing how you stumbled upon it?

Oh, it's unspectacular. oss-fuzz found a related problem
(https://github.com/php/php-src/pull/11581) which made me look into
the implementation of ??= which is when I noticed the strange
behavior.

Ilija




[PHP-DEV] ??= and function calls

2023-07-04 Thread Ilija Tovilo
Hi everyone

I recently discovered some unfortunate behavior of the coalesce
assignment operator (??=) in combination with function calls. Here's
the TL;DR:

foo()['bar'] ??= 42;

Currently, this code calls foo() twice. This seems rather unexpected.
The technical reason as to why this happens is not straightforward,
but I will attempt to explain below. The behavior was not specified in
the RFC (https://wiki.php.net/rfc/null_coalesce_equal_operator) and is
completely untested, and as such I don't believe it is by design. My
proposal is to change it so that foo() is only called once.

This is what is happening in detail.

??= is special in that it needs to evaluate the lhs (left hand side)
twice. At first, we need to check if the offset exists, then
conditionally execute the rhs (right hand side), re-fetch the offset
and assign the rhs value to it. The reason for the re-fetching of the
offset is that the evaluation of the rhs may invalidate the offset.
This is explained in the following blog post:
https://www.npopov.com/2017/04/14/PHP-7-Virtual-machine.html#writes-and-memory-safety
Essentially, the offset may be a pointer into an array element or
object property. If the rhs frees the array or object, or grows the
array causing a reallocation (meaning it is moved to some other place
in memory), the pointer is no longer valid. For this reason, PHP makes
sure no user code may execute between the fetching of an offset and
the assignment to it. Normally, that just means evaluating the rhs
before fetching the offset. In this case, we need to evaluate the lhs
first to know if we even should evaluate the rhs.

Naively evaluating the lhs again poses a problem for expressions with
side-effects. For example:

$array[$x++] ??= 42;

We do not want to re-evaluate the entire expression because $x++ will
lead to a different array offset the second time around. The way this
is solved is by "memoizing" any compiled expression in the lhs that is
*not* a variable, meaning not part of the offset that may be
invalidated. Internally, a variable is considered anything that may be
written to, i.e. local variables ($foo), properties ($foo->bar,
Foo::$bar), array offsets ($foo['bar']), and function calls (foo(),
$foo->bar(), Foo::bar(), $foo(), as they may return a modifiable
reference). The fact that function calls are included in that list
leads to the problem presented above. It is not actually necessary to
exclude them from memoization because their result may not be
invalidated.
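
The effect of memoizing the non-variable parts can be observed from
userland. A short sketch with made-up variable names:

```php
<?php
$array = [];
$x = 0;

// $x++ is evaluated once, so the offset that is tested and the offset
// that is assigned are the same.
$array[$x++] ??= 42;
var_dump($x, $array);

// When the offset already holds a non-null value, the rhs is skipped,
// but $y++ is still evaluated exactly once.
$y = 0;
$array[$y++] ??= 99;
var_dump($y, $array);
```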

Another inconsistency is that function call arguments will be
re-evaluated, but only if they are not part of some other expression.

a. foo(bar())['baz'] ??= 42;
b. foo(bar() + 0)['baz'] ??= 42;

a calls both foo() and bar() twice. b however calls foo() twice but
bar() only once. That is because the expression bar() + 0 is *not*
considered a variable and as such gets memoized.

I propose to unconditionally memoize calls (in all forms) when they
appear in the lhs of a coalesce expression. This will ensure that
calls are only executed once, including function arguments and the lhs
of method calls. Consequently, the assignment will be performed on the
same offset that was previously tested, even if the expression
contains a function call with side-effects.
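
A rough way to observe the difference is to count the calls. The counter
and the by-value foo() below are assumptions made for the illustration.
The exact count depends on whether the PHP version in use includes the
change, so the sketch prints it rather than assuming a value.

```php
<?php
$calls = 0;

// Hypothetical function with an observable side effect.
function foo(): array
{
    global $calls;
    $calls++;

    return [];
}

// Per the description above: two calls without the change, one with it.
foo()['bar'] ??= 42;
$callsInline = $calls;
var_dump($callsInline);

// Version-independent alternative: call once, then operate on a variable.
$calls = 0;
$result = foo();
$result['bar'] ??= 42;
var_dump($calls, $result);
```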

The implementation for this change is simple:
https://github.com/php/php-src/pull/11592

Let me know if you have any concerns. I'm planning on merging this for
master if there is consensus on the semantics.

Ilija




Re: [PHP-DEV] RFC1867 (multipart/form-data) PUT requests

2023-06-28 Thread Ilija Tovilo
Hi Ben

On Tue, Jun 27, 2023 at 9:54 PM Ben Ramsey  wrote:
>
> > On Jun 27, 2023, at 04:01, Ilija Tovilo  wrote:
> >
> > Hi Ben, Hi Rowan
> >
> > On Mon, Jun 26, 2023 at 8:55 PM Ben Ramsey  wrote:
> >>
> >>> On Jun 20, 2023, at 06:06, Rowan Tommins  wrote:
> >>>
> >>> On Tue, 20 Jun 2023 at 10:25, Ilija Tovilo  wrote:
> >>>
> >>>> Introduce a new function (currently named populate_post_data()) to
> >>>> read the input stream and populate the $_POST and $_FILES
> >>>> superglobals.
>
> In the past, I’ve used something like the following to solve this:
>
> parse_str(file_get_contents('php://input'), $data);
>
> I haven’t looked up how any of the frameworks solve this, but I would be 
> willing to bet they also do something similar.
>
> Rather than implementing functionality to populate globals, would you be 
> interested in introducing some new HTTP request functions? Something like:
>
> http_request_body(): string
> http_parse_query(string $queryString): array
>
> `http_request_body()` would return the raw body and would be the equivalent 
> of calling `file_get_contents('php://input')`. Of special note is that it 
> should _always_ return the raw body, even if `$_POST` is populated, for the 
> sake of consistency and reducing confusion.
>
> `http_parse_query()` would be the opposite of `http_build_query()` and would 
> return a value instead of requiring a reference parameter, like `parse_str()`.

The problem is that the content stream for multipart/form-data is
expected to be big, as in possibly multiple gigabytes big. We can't
use http_request_body() to return the entire content as a string at
once. The current RFC1867 implementation reads and operates in chunks,
i.e. appends it to a file or to a string, depending on the content
part. It never has to hold on to the entire content in memory.
http_request_body() also can't return the content of the request again
after it has been consumed, because that's not how the HTTP protocol
works. We would need to buffer the content somewhere when reading it
for the first time, which again we can't do because it may be very
big.
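
To illustrate the difference between chunked processing and returning the
whole body as one string, here is a userland sketch. It is not the RFC1867
implementation; the function name and chunk size are made up, and
php://memory stands in for the request body.

```php
<?php
// Copies $in to $out in fixed-size chunks, so at most $chunkSize bytes
// of the payload are read into a PHP string at any one time.
function copyInChunks($in, $out, int $chunkSize = 8192): int
{
    $total = 0;
    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $total += fwrite($out, $chunk);
    }

    return $total;
}

// In a web request the source would be fopen('php://input', 'r').
$source = fopen('php://memory', 'w+');
fwrite($source, str_repeat('x', 20000));
rewind($source);

$sink = fopen('php://temp', 'w+');
$bytes = copyInChunks($source, $sink);
var_dump($bytes);
```

A function returning the entire body as a string would instead need
memory proportional to the size of the request.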

It may be possible to pass the fopen('php://input', 'r') stream to
this function and let it consume it. However, as mentioned in my
original e-mail this requires some changes to how RFC1867 requests are
handled. Currently, it calls sapi_module.read_post() which directly
reads from the TCP socket. Instead, we'd need to read from the stream,
possibly in addition so that the general case is not degraded in terms
of performance. I'll verify if this is an option, and whether the
changes are (too) big. However, I don't suspect there to be a lot of
use-cases for this as RFC1867 is primarily used for requests and not
for responses, so you wouldn't usually need to parse this type of
content from some other source.

As for returning the parsed values as non-globals, that's entirely
possible. However, it's inconsistent with how requests are currently
handled. The values will need to be passed around manually and kept
alive, but the function still modifies global state (i.e. the input
stream, whether that's sapi_module.read_post() or php://input). I
don't believe it will be common to call this function more than once
per request, and thus decoupling the state is not really necessary.

Ilija




Re: [PHP-DEV] RFC1867 (multipart/form-data) PUT requests

2023-06-27 Thread Ilija Tovilo
Hi Ben, Hi Rowan

On Mon, Jun 26, 2023 at 8:55 PM Ben Ramsey  wrote:
>
> > On Jun 20, 2023, at 06:06, Rowan Tommins  wrote:
> >
> > On Tue, 20 Jun 2023 at 10:25, Ilija Tovilo  wrote:
> >
> >> Introduce a new function (currently named populate_post_data()) to
> >> read the input stream and populate the $_POST and $_FILES
> >> superglobals.
> >
> > How about "request_form_populate_globals"?

The word "form" seems a bit out of place (even though it appears in
both multipart/form-data and application/x-www-form-urlencoded),
because this function is mainly targeted at PUT/PATCH requests for
REST APIs. Maybe request_body_populate_globals?

> Another option for the name: `populate_multipart_form_data()`.

I avoided the term "multipart" because the function technically also
works for application/x-www-form-urlencoded requests. It's less
necessary for the reasons outlined in my previous email, but it would
allow for consistent handling of such requests for all HTTP methods.

Some people on GitHub voiced that they would prefer an INI setting.
Therefore I will create an RFC accordingly.

Ilija




Re: [PHP-DEV] Request to create a RFC for const args

2023-06-26 Thread Ilija Tovilo
Hi Sam!

> Hello php internals team,
> I would like to put forward an RFC for a new feature. I am a long time user, 
> but have not yet participated in the RFC process.

Sorry for the late reply.

Thanks for your suggestion! If you'd like RFC karma, we'll need to
know your username on the wiki.

Ilija




[PHP-DEV] RFC1867 (multipart/form-data) PUT requests

2023-06-20 Thread Ilija Tovilo
Hi internals

A while ago I encountered a limitation of how RFC1867 requests are
handled in PHP. PHP populates the $_POST and $_FILES superglobals when
the Content-Type is multipart/form-data or
application/x-www-form-urlencoded, but only when the method is POST.
For application/x-www-form-urlencoded PUT requests this is not a
problem because the format is simple, usually limited in size and PHP
offers functions to parse it, namely parse_str and parse_url. For
RFC1867 it's a different story.
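
For the urlencoded case, the manual approach is short. A sketch with a
made-up body; in a web request the string would come from
file_get_contents('php://input').

```php
<?php
// Hypothetical raw body of an application/x-www-form-urlencoded PUT request.
$rawBody = 'title=Hello+World&tags[]=php&tags[]=rfc';

// parse_str() decodes the pairs into $data, including nested array keys.
parse_str($rawBody, $data);
var_dump($data);
```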

The code handling the request will need to use streams because RFC1867
is often used with files, the format is much more complicated, files
should be cleaned up when the request ends if unused, etc. Handling
this manually is non-trivial. This has been reported many years ago,
and evidently caused a bit of frustration.
https://bugs.php.net/bug.php?id=55815

This is not limited to PUT either, multipart/form-data bodies are
valid with other requests. Here's the approach I believe is best.

Introduce a new function (currently named populate_post_data()) to
read the input stream and populate the $_POST and $_FILES
superglobals. The function works for any non-POST requests. It assumes
that none of the input stream has been consumed, and that the
Content-Type is set accordingly. A nice side-effect of this approach
is that it may be used with the enable_post_data_reading ini setting
to decide whether to parse the RFC1867 bodies dynamically. For
example, a specific endpoint may accept bigger requests. The function
may be implemented in a more generic way 1. by returning the
data/files arrays instead of populating the superglobals and 2. by
providing an input stream manually. I don't know if there's such a
use-case and thus if this is worthwhile, as it would require bigger
changes in the RFC1867 handling.

Here's the proof-of-concept implementation:
https://github.com/php/php-src/pull/11472

For completeness, here are other options I considered.

1. Create a new $_PUT superglobal that is always populated. Two
issues: The obvious one is that this is limited to PUT requests. While
we could also introduce $_PATCH, this seems like a poor solution.
While discouraged, other methods can also contain bodies. Another
issue is that the code for processing RFC1867 consumes the input
stream. This constitutes a BC break. Buffering the input is not
feasible for large requests that would be expected here.
2. The same as option 1, but populate the existing $_POST global. This
comes with the same BC break.
3. The same as options 1 or 2 with an additional ini setting to opt
into the behavior. The issue with this approach is that both the old
and new behavior might be desired in different parts of the same
application. The ini option can't be changed at runtime because the
populating of the superglobals happens before user code is being
executed.

Let me know what your thoughts are. If there is consensus in the
feedback I'll update the implementation accordingly and post an update
to the list. If there is no consensus, I will create an RFC.

Ilija




Re: [PHP-DEV] Expression code blocks

2023-06-17 Thread Ilija Tovilo
Hi Andreas

On Fri, Jun 16, 2023 at 9:23 PM Andreas Hennings  wrote:
>
> Hello list,
> I don't know if something like this was already proposed in the past,
> I did not find anything.
>
> Sometimes it would be nice to have a code block inside an expression, like 
> this:
>
> public function f(string $key) {
> return $this->cache[$key] ??= {
> // Calculate a value for $key.
> [...]
> return $value;
> }
> }

This has been discussed a few years back when match expressions were
proposed. I originally wanted to include support for code blocks along
with expressions to offer a more complete alternative to switch
statements. The other major use-case for block expressions are arrow
functions.

Unfortunately, a general solution seems suboptimal due to the subtle
semantic differences. See this message for my detailed thought
process.

https://externals.io/message/109941#109947

I believe it would be best to address blocks for match arms and arrow
functions separately.

I don't believe blocks for general expressions are that useful in PHP
due to the lack of block scoping. Your suggestion to make the block a
separate closure could avoid that (as well as the optimizer issue
mentioned below) but comes with new issues, like making modification
of captured values impossible without by-ref capturing. It seems
confusing that fn {} is auto-executed while fn() {} isn't, as the
former looks like a shortened version of the latter. fn() => fn {}
would also look quite weird. match ($x) { 1 => fn {} } seems ok,
except for being somewhat lengthy.
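
To illustrate the capturing issue, here is a minimal sketch that emulates
a block expression with an immediately-invoked closure in current PHP.
The variable names are hypothetical and this is not the syntax proposed
in the thread, only an approximation of the "separate closure" semantics:

```php
<?php
// Hypothetical example: a "block" emulated by a closure that is
// invoked immediately.
$cache = [];
$calls = 0;

// By-value capture: the closure only sees a copy of $calls.
$cache['a'] ??= (function () use ($calls) {
    $calls++; // modifies the copy, not the outer variable
    return 'computed';
})();
var_dump($calls); // int(0)

// By-reference capture is required to modify the outer variable.
$cache['b'] ??= (function () use (&$calls) {
    $calls++;
    return 'computed';
})();
var_dump($calls); // int(1)
```

A real block would be expected to share the enclosing scope, which is
why the closure-based desugaring changes behavior for assignments.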

On another note, the vote for blocks in short closures failed
recently (https://wiki.php.net/rfc/auto-capture-closure).

The message above also addresses the syntax ambiguity you mentioned.
The {} syntax would be unambiguous in the most useful contexts (e.g.
function parameters, match arms, arrow function bodies, rhs of binary
operators, etc.). It is ambiguous in the general expression context
due to expression statements (statements containing a single
expression followed by `;`), where it's unclear (without lookahead)
whether the `{` refers to a statement block or a block expression.
Replacing all statement blocks with block expressions comes with the
added difficulty of allowing the `;` of block expressions to be omitted
in an expression statement.

I remember there also being issues with the optimizer (related to
https://www.npopov.com/2022/05/22/The-opcache-optimizer.html#liveness-range-calculation).
The details went over my head at the time.

I'm interested in picking this back up at some point, at least for match arms.

Ilija




Re: [PHP-DEV] [RFC] Interface Default Methods

2023-06-15 Thread Ilija Tovilo
Hi Levi

> I am moving my RFC for interface default methods to discussion:
> https://wiki.php.net/rfc/interface-default-methods.

This or a similar concept makes sense to me. The proposal seems
similar to Swift protocol extensions, or Rust traits, with the
exception that function default implementations may only be defined in
the interface itself.

Note that there's a large overlap between this proposal and extending
traits to allow implementing interfaces.
https://wiki.php.net/rfc/traits-with-interfaces The main difference is
how you would use the feature from a given class, i.e. using an
interface implementation or a trait usage. Implementing interfaces
from traits would require declaring both a trait and an interface. I
do think your proposal is the more natural approach.

The redundancy of interfaces and traits after this RFC is also
somewhat unfortunate. Both interfaces and traits could inject default
behavior into classes. Both could enforce implementation of methods in
classes (traits through abstract methods). My intuition is that
interface default implementations should be used for public APIs
(because this provides an abstracted interface), while traits should
be used for protected/private ones (because non-public methods can't
be added to interfaces). The other obvious difference is that
interfaces don't allow manual conflict resolution while traits do.
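
For comparison, this is a minimal sketch of how default behavior is
shared in current PHP by pairing an interface with a trait. All names
are hypothetical. It shows the two points above: a trait can enforce a
method through `abstract`, and it can carry non-public helpers that an
interface cannot declare:

```php
<?php
// Hypothetical names, current PHP without interface default methods.
interface Greeter
{
    public function name(): string;
    public function greet(): string;
}

trait GreeterDefaults
{
    // Traits can force the using class to implement a method.
    abstract public function name(): string;

    // Default implementation injected into the using class.
    public function greet(): string
    {
        return 'Hello, ' . $this->name() . $this->suffix();
    }

    // Non-public helpers are possible in traits but not in interfaces.
    private function suffix(): string
    {
        return '!';
    }
}

final class WorldGreeter implements Greeter
{
    use GreeterDefaults;

    public function name(): string
    {
        return 'world';
    }
}

echo (new WorldGreeter())->greet(), "\n"; // Hello, world!
```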

The RFC doesn't mention default implementations for static methods.
I'm not sure there's a use case but it might make sense to explicitly
mention whether they are supported.

Ilija




Re: [PHP-DEV] RFC [Discussion]: Marking overridden methods (#[\Override])

2023-05-22 Thread Ilija Tovilo
Hi Tim

On Thu, May 11, 2023 at 6:37 PM Tim Düsterhus  wrote:
> I'm now opening discussion for the RFC "Marking overridden methods
> (#[\Override])":
>
> RFC: Marking overridden methods (#[\Override])
> https://wiki.php.net/rfc/marking_overriden_methods

We've already talked in private, but let me state my position here as well.

The implementation is quite simple (~60 lines of non-whitespace,
non-generated C code), and does not introduce any new syntax that
parsers/static analyzers are *forced* to handle.

The RFC shows that there is a benefit for code using the attribute,
namely showing intent to the reader, reducing the risk of typos and
being more resistant to errors when refactoring / upgrading library
versions. Having the feature supported by the engine, while not
strictly necessary, allows users who don't use static analyzers to
profit from it, and defines consistent semantics for static analyzers
to follow.
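
As a rough sketch of the usage, assuming the semantics described in the
RFC (class names are hypothetical): the attribute marks a method that is
expected to override a parent method, and an engine that supports it
rejects the class if no such parent method exists. Engines without
support simply ignore the unknown attribute.

```php
<?php
// Hypothetical classes illustrating #[\Override] as described in the RFC.
class Repository
{
    public function findAll(): array
    {
        return [];
    }
}

class CachedRepository extends Repository
{
    #[\Override]
    public function findAll(): array
    {
        return ['cached'];
    }

    // With engine support, uncommenting the following is expected to be
    // an error, because no parent method named findAl() exists (typo):
    // #[\Override]
    // public function findAl(): array { return []; }
}

var_dump((new CachedRepository())->findAll());
```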

The benefits seem worth the maintenance cost, even if small for the
average user.

Ilija




Re: [PHP-DEV] PHP Package for PHP

2023-05-18 Thread Ilija Tovilo
Hi Marco

On Thu, May 18, 2023 at 7:35 PM Rowan Tommins  wrote:
>
> On Thu, 18 May 2023 at 16:27, Deleu  wrote:
>
> > Monolog is a great example of what PHP is missing - a single library for a
> > purpose. I have never worked with any other library besides Monolog and I
> > never worked on any project which didn't have it installed. Perhaps my
> > bubble might be a limiting factor here, but I get a feeling that Monolog is
> > considered to be The Logging PHP Library.
> >
>
>
> Then in what sense is it "missing"? What value would be served by placing
> an elephant logo on it, and renaming it "PHPLog™"?
>
> I know that's a bit of a sarcastic response, but it's also a serious one -
> what would we define as the aims of a replacement for Monolog, which aren't
> currently being served?
>
> We could guarantee it was installed with every version of PHP, but only by
> severely restricting its release cycle, so that every PHP version had
> exactly one version of Monolog. If it remains an independently versioned
> Composer package, I can't think of much that would change.

I fully agree with Rowan. These packages have had many years to get to
where they are, and are maintained by capable people. Putting an
official stamp on them doesn't make them qualitatively better.

I could see an argument being made for bundling them with PHP so that
Composer is not required, but that does not seem smart for large
libraries that need the freedom to evolve. I'm absolutely in favor of
a more complete standard library in terms of basic operations, e.g.
better iterator support.

> > Laravel's `Arr` class also didn't get scrutinized by PHP RFC so there's no
> > way to know whether it's all good, some good or all bad.
> >
>
>
> I don't think PHP's decision-making process can be held up as a shining
> example of good governance, in contrast to everyone else's anarchy. I don't
> know much about Laravel's governance, but I am quite sure every change is
> discussed and iterated on before release. In fact, they probably have a
> whole bunch of standards and processes that PHP is lacking, and would have
> to invent to make any new library a success.

Moreover, I believe the main barrier for adding new functions to PHP's
standard library is not C but the RFC process itself. "Trial and
error" is much easier than making 50+ people agree on the correct
solution on the first try.

I suppose something we could try is an "official but experimental"
Composer package for testing new classes/functions for the standard
library before stabilizing them and rewriting them in C. Maybe this
could prevent some of the bike-shedding.

Ilija




Re: [PHP-DEV] Final anonymous classes

2023-04-25 Thread Ilija Tovilo
Hi Claude

> > Hi all,
> >
> > I've submitted https://github.com/php/php-src/pull/11126 to add support for 
> > final anonymous classes, though as noted by iluuu1994, it would probably 
> > make more sense to just make all anonymous classes final by default, what 
> > do you think?
>
> Extending an anonymous class is indeed possible (https://3v4l.org/pDFTL), but 
> it is a hack at best. If someone wants a non-final class, could they not 
> write a non-anonymous one? As a bonus, they wouldn’t need to instantiate the 
> class before referencing it.

Indeed. The argument was that, if you need to give the anonymous class
a dedicated name through an alias to extend it, you might as well
declare a named class in the first place.

In case somebody finds benefit in making anonymous classes open, it
seems more sensible to make them opt into openness, rather than
applying this behavior to all anonymous classes that are used as final
99.9% of the time. Although I really don't think that is necessary.

Ilija




Re: [PHP-DEV] Expansion of PHP Symbols?

2023-04-21 Thread Ilija Tovilo
Hi Deleu

> From Rob's email (https://externals.io/message/120094#120097), the argument
> against a simple "use" statement seems quite natural. I certainly don't
> want to redefine "use int|float as Number" in every PHP file I work with,
> so naturally we would go back to type alias definition, symbol registration
> and autoloading. So I guess my final question is: what is fundamentally
> different about Type Alias when compared to interfaces, classes, enums that
> make this controversial?

I don't think autoloading is the fundamental issue with type aliases,
nor are the symbol tables. Enums live in the class symbol table, as
they are just classes. Type aliases don't need most things classes
need, but they could live there too with a discriminator flag if we're
ready to waste that space for "convenience" of not rewriting all
accesses to the class table.

I believe the bigger issue is typing itself. There are multiple complications.

* Currently, any name that is not a known named type is considered a
class type. With type aliases this assumption is no longer correct,
which may require many changes in the engine (and in the optimizer).
* Combinations of union and intersection types are limited at the
moment (Foo|Bar, Foo&Bar, (Foo&Bar)|Baz). With type aliases we can
nest types indirectly to create new combinations that were previously
disallowed on a syntax level. We'll either have to handle these
correctly (which from what I understand is quite complicated) or
disallow them at runtime.
* Type variance may be challenging. E.g. do we allow substituting a
type alias with its concrete types and vice versa? What about
substituting two equivalent typealiases? There are infinite
combinations.
* For runtime type checking itself we would need to compare the value
against the concrete types instead of the typealias, thus complicating
and slowing down the type check.
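
For reference, a small sketch of the type combinations that can be
written directly today (intersection types need PHP 8.1+, the
parenthesized DNF form needs PHP 8.2+). The interface and function names
are hypothetical. A type alias would allow nesting beyond these forms
without the parser being able to reject it:

```php
<?php
// Hypothetical interfaces and classes for illustration only.
interface Foo {}
interface Bar {}
interface Baz {}

final class FooBar implements Foo, Bar {}
final class OnlyBaz implements Baz {}

// Union type.
function acceptsUnion(Foo|Bar $value): string
{
    return 'union';
}

// Intersection type (PHP 8.1+).
function acceptsIntersection(Foo&Bar $value): string
{
    return 'intersection';
}

// Disjunctive normal form type (PHP 8.2+).
function acceptsDnf((Foo&Bar)|Baz $value): string
{
    return 'dnf';
}

echo acceptsUnion(new FooBar()), "\n";
echo acceptsIntersection(new FooBar()), "\n";
echo acceptsDnf(new OnlyBaz()), "\n";
```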

All of those could be solved (to some extent) by substituting the
typealias with the concrete types as early as possible and reusing the
existing type system. This is the approach I've tried some years ago:
https://github.com/php/php-src/compare/master...iluuu1994:php-src:typealias

The main issue with this approach is that classes/functions are
generally immutable (with OPcache) because we want to store them in
shared memory where all processes can access them. We have mechanisms
to make *parts* of the class/function mutable per request but
adjusting this for all types might once again require many code
changes. Furthermore, every type (with a typealias, at least) would
require copying to process space to substitute the typealiases with
the concrete type, for every request. This might or might not be
significant, it's hard to tell without measuring.

But the main reason why I stopped working on this was, what do we use
it for? Right now the main use cases are union and intersection types
which are fairly limited or short in my personal PHP code. A
reasonable use case might be closure types. However, I have become
increasingly sceptical whether runtime types for closures are the
direction we should take, as 1. they may be slow, hard to implement or
both and 2. most code doesn't *want* to add closure types that could
be inferred in most other typed languages.

This e-mail is not too structured and not exhaustive, let me know if
you have any more questions.

Ilija




Re: [PHP-DEV] [Discussion] Callable types via Interfaces

2023-04-21 Thread Ilija Tovilo
Hi Larry and Nicolas!

> https://wiki.php.net/rfc/allow_casting_closures_into_single-method_interface_implementations
> https://wiki.php.net/rfc/allow-closures-to-declare-interfaces-they-implement
> https://wiki.php.net/rfc/structural-typing-for-closures
>
> What we propose is to instead lean into the interface approach.  
> Specifically, recall that all closures in PHP are actually implemented as 
> classes in the engine.  That is:
>
> $f = fn(int $x, int $y): int => $x + $y;
>
> actually turns into (approximately) this in the engine:
>
> $f = new class extends \Closure
> {
> public function __invoke(int $x, int $y): int
> {
> return $x + $y;
> }
> }

Just to comment on the technical aspect, I don't think this is
accurate. Closures are indeed objects, but they are all instances of
the same \Closure class. From what Nikita said in the enum RFC,
objects are optimized for size, classes are not. Having different
closures implement different interfaces does mean they probably all
need their own class, or type checks need to account for closures in
some alternative way.
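
This can be checked from userland. A minimal sketch (variable names are
arbitrary) showing that two unrelated closures report the same class:

```php
<?php
// Two closures with different signatures and bodies.
$add = fn(int $x, int $y): int => $x + $y;
$neg = function (int $x): int {
    return -$x;
};

var_dump($add instanceof \Closure);              // bool(true)
var_dump(get_class($add));                       // string(7) "Closure"
var_dump(get_class($add) === get_class($neg));   // bool(true)
```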

Ilija




Re: [PHP-DEV] Future stability of PHP?

2023-04-08 Thread Ilija Tovilo
Hi Stephan

> I'm sorry if this isn't the correct mailing list for that discussion but I
> couldn't find a more appropriate one where people actually know how the
> wind is blowing.

No worries, this seems like the appropriate place.

> Is there a way to tell which APIs and language features will be stable
> and which might get changed or removed in the future? That way I could
> restrict myself to a stable subset for long-running projects (5 to 10
> years). But I realize that such guarantees are difficult (or
> counterproductive) when a language is in flux and I can understand if no
> one wants to do something like that.

There's no such guarantee at the moment. Anything that is agreed upon
by the 2/3 majority of voters is subject to change.

> Some of my projects run for 5 to 10 years, in one case even 20 years.

There are companies that offer extended lifecycle support for PHP
versions that have officially reached EOL (examples being Zend or
TuxCare). I'm not sure if those will get you to that 10 year mark.
Note however that deprecation notices alone should not keep you from
upgrading to a newer version, as they don't have any runtime effect
with a correct production configuration and thus aren't considered
breaking. We generally postpone larger breaking changes to major
versions. In general, I do agree that we could minimize breaking
changes more, and limit them to the things that actually provide a
tangible benefit.

Sadly, there's a conflict of interest here. There are people who want
to keep running their existing websites without having to make any
changes, and there are people who are using PHP daily and would like
to see the language evolve. We would like to satisfy both of these
groups but that is difficult when they are often directly opposed. I
do think that if we only manage to satisfy the former, PHP will
gradually become less and less significant. That would be sad.

Ilija




Re: [PHP-DEV] Array spread append

2023-04-06 Thread Ilija Tovilo
Hi Michael

> I would like to open a discussion for
> https://github.com/php/php-src/issues/10791 (Array spread append).
>
> Description: Currently spread operator can be used for almost anything. But
> not for array append. I propose the following to be supported:
> $arr = [1, 2]; $arr2 = [3, 4]; $arr[...] = $arr2; // ...
>
> Appending N elements to an array is quite a common language usage pattern
> and I believe it should be supported natively for shorter syntax, language
> consistency and performance.

There are a few questions that come to mind (there may be more).

* Are integer keys preserved? I'm assuming no, as otherwise it would
be the same as `$a + $b`.
* What is the return value of the expression `$a[...] = $b`? I'm
assuming $a after the additions of $b?
* How does it behave in combination with `ArrayAccess`? Throw? Call
`offsetSet` for each element?
* How does it interact with references? E.g. This is valid PHP code:
`assign_by_ref($a[])` (https://3v4l.org/qoJYn)
* How does it interact with undefined/null values? E.g. `$a[] = 42;`
works without declaring $a first.
* Is there a need for this? Given that `+` doesn't work with
sequential lists and `array_push($a, ...$b)` doesn't work with string
keys, I'd say possibly. `[...$a, ...$b]` works but requires duplication of
the array which in loops can be detrimental to performance.
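
For reference, a short sketch of how the existing alternatives behave
for two sequential lists (variable names are arbitrary):

```php
<?php
$a = [1, 2];
$b = [3, 4];

// `+` keeps existing keys, so keys 0 and 1 of $b are ignored.
$union = $a + $b;
var_dump($union === [1, 2]); // bool(true)

// Spreading into a new array literal renumbers integer keys, but
// builds a new array instead of appending in place.
$spread = [...$a, ...$b];
var_dump($spread === [1, 2, 3, 4]); // bool(true)

// array_push() with argument unpacking appends in place.
$pushed = $a;
array_push($pushed, ...$b);
var_dump($pushed === [1, 2, 3, 4]); // bool(true)
```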

Ilija




[PHP-DEV] Re: [RFC][Vote] Arbitrary static variable initializers

2023-04-05 Thread Ilija Tovilo
Hi everyone

> I've opened the vote for the arbitrary static variable initializers RFC.
> As usual, the vote is open for two weeks and will be closed on 2023-04-04.
>
> https://wiki.php.net/rfc/arbitrary_static_variable_initializers

The arbitrary static variable initializer RFC has been accepted
unanimously with 25 yes votes.
Thanks to everyone who participated!

Ilija



