Re: [PHP-DEV] [RFC] Transform exit() from a language construct into a standard function
On 11 May 2024 15:43:19 BST, "Gina P. Banyard" wrote: >print, echo, include(_once) and require(_once) do not mandate their "argument" >to be passed within parentheses, so making them functions does not simplify >the lexer/parser nor removes them as keywords. It's actually a much stronger difference than that: parentheses are not parsed as surrounding the argument lists for those keywords at all. A while ago, I added notes to the manual pages of each showing how this can lead to misleading code, e.g. one of the examples on https://www.php.net/print is this: print(1 + 2) * 3; // outputs "9"; the parentheses cause 1+2 to be evaluated first, then 3*3 // the print statement sees the whole expression as one argument echo has further peculiarities, because it takes an unparenthesised list of arguments, and can't be used in an expression. While it would probably have been better if those had been parsed like functions to begin with, changing them now would not just be pointless, it would be actively dangerous, changing the behaviour of existing code. Regards, Rowan Tommins [IMSoP]
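The parsing behaviour described in this message can be checked with a short script. This is only an illustrative sketch; the output-buffering calls are there purely to capture what each statement prints, and the variable names are made up for the example.

```php
<?php
// print is a language construct, so "print(1 + 2) * 3" is parsed as
// "print ((1 + 2) * 3)": the parentheses only group "1 + 2".
ob_start();
print(1 + 2) * 3;
$printed = ob_get_clean();

// echo accepts an unparenthesised, comma-separated list of expressions.
ob_start();
echo 'a', 'b', 'c';
$echoed = ob_get_clean();

// print yields a value (always 1), so it can appear inside an expression;
// echo cannot.
ob_start();
$returned = print 'x';
ob_end_clean();

echo $printed, ' ', $echoed, ' ', $returned, "\n";
```

If print were turned into an ordinary function, the first statement would instead print "3" and then multiply the return value by 3, which is the kind of silent behaviour change the message warns about.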
Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type
On 30 April 2024 11:16:20 GMT-07:00, Arvids Godjuks wrote: >I think setting some expectations in the proper context is warranted here. > >1. Would a native decimal type be good for the language? I would say we >probably are not going to find many if any people who would be against it. As I said earlier, I don't think that's the right question, because "adding a native type" isn't a defined process. Better questions are: Should a decimal type be always available? Does a decimal type need special features to maximise performance? Should we have special syntax for a decimal type? What functions should support a decimal type, or have versions which do? >2. Is there a need for it? Well, the whole world of e-commerce, accounting >and all kinds of business systems that deal with money in PHP world do not >leave any room for doubt - https://packagist.org/?query=money . The use >case is right there :) That's a great example - would a decimal type make those libraries redundant? Probably not - they provide currency and rounding facilities beyond basic maths. Would those libraries benefit from an always-available, high-performance native type? Certainly. Would they benefit from it having strong integration into the syntax and standard library of the language? Not really; there's a small amount of actual code dealing with the values. >4. Is it a lot of engine work? Only if we go for the maximum ambition, highly integrated into the language. > Is it worth it? I'm actually not convinced. >5. But BCMath/GMP/etc!!! Well, extensions are optional. Extensions are only optional if we decide they are. ext/json used to be optional, but now it's always-on. > They are also not as fast and they deal with strings. Not as fast as what? If someone wants to make an extension around a faster library, they can. 
And only BCMath acts directly on strings; other libraries use text input to create a value in memory - whether that's a PHP string or a literal provided by the compiler doesn't make much difference. I absolutely think there are use cases for decimal types and functions; but "I want a faster implementation" and "I want to add a new fundamental type to the language, affecting every corner of the engine" are very different things. Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type
On 28 April 2024 07:47:40 GMT-07:00, Robert Landers wrote: >I'm not so sure this could be implemented as an extension, there just >isn't the right hooks for it. The whole point of my email was that "this" is not one single feature, but a whole series of them. Some of them can be implemented as an extension right now; some could be implemented as an extension by adding more hooks which would also be useful for other extensions; some would need changes to the core of the language. If the aim is "everything you could possibly want in a decimal type", it certainly can't be an extension; if the aim is "better support for decimals", then it possibly can. Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type
On 28 April 2024 07:02:22 BST, Alexander Pravdin wrote: >Hello everyone. To continue the discussion, I'm suggesting an updated >version of my proposal. This all sounds very useful ... but it also sounds like several months of full-time expert development. Before you begin, I think it will be really important to define clearly what use cases you are trying to cater for, and who your audience is. Only then can you define a minimum set of requirements and goals. It seems to me that the starting point would be an extension with a decimal type as an object, and implementations for all the operations you want to support. You'll probably want to define that more clearly than "anything in the language which takes a float". What might seem like it would be the next step is converting the object to a "native type", by adding a new case to the zval struct. Not only would this require a large amount of work to start with, it would have an ongoing impact on everyone working with the internals. I think a lot of the benefits could actually be delivered without it, and as separate projects: - Optimising the memory performance of the type, using copy-on-write semantics rather than eager cloning. See Gina's recent thread about "data classes". - Overloading existing functions which accept floats with decimal implementations. Could potentially be done in a similar way to operator overloads and special interfaces like Countable. - Convenient syntax for creating decimal values, such as 0.2d, declare(default_decimal), or having (decimal) casts affecting the tree of operations below them rather than just the result. This just needs the type to be available to the compiler, not a new zval type - for instance, anonymous function syntax creates a Closure object. There may be other parts I've not mentioned, but hopefully this illustrates the idea that "a native decimal type" doesn't have to be one all-or-nothing project. Regards, Rowan Tommins [IMSoP]
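Two of the mechanisms mentioned in this message already exist in PHP and can be seen in a few lines. The sketch below is not a decimal implementation; `Basket` is a hypothetical class used only to show how a built-in function can dispatch to an object through an interface, and how dedicated syntax can produce an ordinary object.

```php
<?php
// A userland class can opt in to the built-in count() via Countable.
final class Basket implements Countable
{
    /** @param list<string> $items */
    public function __construct(private array $items) {}

    public function count(): int
    {
        return count($this->items);
    }
}

$basket = new Basket(['apple', 'pear']);
$size = count($basket); // dispatches to Basket::count()

// Arrow-function syntax is compiled into an object of class Closure;
// no dedicated zval type is involved.
$double = fn (int $x): int => $x * 2;
$isClosure = $double instanceof Closure;

echo $size, ' ', var_export($isClosure, true), ' ', $double(4), "\n";
```

A decimal type could plausibly follow the same pattern: an interface that functions accepting floats check for, and a literal syntax that compiles to an object.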
Re: [PHP-DEV] [RFC][Vote] Property Hooks
On 15/04/2024 17:43, Larry Garfield wrote: The vote for the Property Hooks RFC is now open: https://wiki.php.net/rfc/property-hooks Voting will close on Monday 29 April, afternoonish Chicago time. I'm somewhat conflicted on this one. On the one hand, I think the feature will be very powerful, and it's clear a lot of effort has been put into designing something that fits with the language. On the other hand, however, I share the concerns some have expressed that it is a very complex proposal. I would have more enthusiastically supported one which left out a few "bells and whistles", or moved them to Future Scope to be polished and agreed separately. I hope I was consistent in expressing that during the discussion phase. For that reason, please consider this an "abstention": while I'm reasonably happy for the RFC as written to pass, I am not going to cast a Yes vote myself. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC] [Discussion] #[\Deprecated] attribute again v1.3
On 26 April 2024 09:40:57 BST, Mike Schinkel wrote: >Given a lack of agreed definition for 'since' it appears you are using narrow >assumptions about the meaning of 'since' that led you to view 'since' as >useless. I can't see any ambiguity in the definition: "This function has been deprecated since version 7.2" seems a straightforward English sentence, meaning that before 7.2 it wasn't deprecated, and from that version onward it is. If there's some alternative reading of it, it's not that I'm assuming it doesn't apply, it's that I'm completely unaware of what it might be. Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC] [Discussion] #[\Deprecated] attribute again v1.3
On 25 April 2024 22:01:35 BST, Mike Schinkel wrote: >> On Apr 25, 2024, at 11:28 AM, Rowan Tommins [IMSoP] >> wrote: >> If the project has no clear deprecation policy, the information is useless >> anyway. > >Not true. > >Having standardized notation for deprecation would allow tooling to analyze a >codebase and determine if it contains deprecated code that needs to be >remediated without having to run the code with full coverage. I think you missed the context of that sentence - or I'm missing something in yours. I meant specifically that the "deprecated since" information is useless if there's no published policy on how long something will stay deprecated. I think the "deprecated" attribute itself is definitely useful. Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC] [Discussion] #[\Deprecated] attribute again v1.3
On 25/04/2024 08:40, Stephen Reay wrote: If you're on X.y and it says it was deprecated in X.w you know you don't need to worry about it being removed until at least Y.a. Yeah, that's the reasoning given in the Rust discussion, but I don't find it convincing. If the project's deprecation policy is that deprecations will be removed in the next major version, the information is redundant: if you get the deprecation message in 2.x, you know it will be removed in 3.0 If the project has some other deprecation policy, like "after 1 full major version cycle", then you can work out that "since: 2.3" means removal in 4.0; but the person adding the attribute also knows that, and could save the reader some effort by writing "planned removal: 4.0" If the project has no clear deprecation policy, the information is useless anyway. If you wanted it to be clearer I'd suggest maybe rename "since" to "version", but that's more to give a hint at intended use than anything. I don't think there's anything *unclear* about "since", I just don't think it's very *useful*. But apparently it's common to write it, so I guess I'm in the minority. Naming it "version" would just make it less clear, and not resolve anything from my point of view. Regards, -- Rowan Tommins [IMSoP]
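The argument about policies can be made concrete with a small calculation. `planned_removal()` below is a hypothetical helper, not part of any RFC or library; it assumes versions of the form "major.minor" and a policy expressed as the number of full major version cycles a deprecation is kept for.

```php
<?php
/**
 * Hypothetical helper: derive the planned removal version from a "since"
 * value and the number of full major version cycles a deprecation is
 * kept for under the project's policy.
 */
function planned_removal(string $since, int $fullCyclesKept): string
{
    $major = (int) explode('.', $since)[0];

    return ($major + 1 + $fullCyclesKept) . '.0';
}

echo planned_removal('2.3', 0), "\n"; // policy: removed in the next major version
echo planned_removal('2.3', 1), "\n"; // policy: kept for one full major cycle
```

The helper needs the policy as an input, which is the point being made: "since" on its own says nothing about removal, and an author who knows the policy could state the removal version directly.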
Re: [PHP-DEV] [RFC] [Discussion] #[\Deprecated] attribute again v1.3
On 24 April 2024 18:18:28 BST, Jorg Sowa wrote: > What about setting this parameter vaguely as the boolean we can pass? > ... > #[Deprecated(since: $packageVersion > 5.5)] > #[Deprecated(since: PHP_VERSION_ID > 80100)] > #[Deprecated(since: date("Y-m-d") > "2024-01-21")] Even if these expressions were legal, as far as I know, standard reflection doesn't give any access to the source code or AST of how the attribute was written, so this would just end up with a meaningless "$since = true", and some source code that might as well be a comment. To be honest, I'm not really sure what I'd do with the information in a "since" field even if it was there. If you were running PHP 7.4, what difference would it make to know that create_function was deprecated in 7.2, rather than in 7.1 or 7.3? The two relevant facts are when the suggested replacement was introduced (in case you need to support multiple versions); and what is the soonest that the deprecated feature will be removed. The second in particular is something I would like every deprecation message to include, rather than the vague "may be removed in a future version". I found this discussion of "since" in Rust's implementation, but don't find the arguments in favour particularly compelling: https://github.com/rust-lang/rfcs/pull/1270#issuecomment-138043714 Of interest, that discussion also linked to a related feature in Java, which could perhaps be added to a list in the RFC alongside the Rust and JetBrains ones already mentioned: https://openjdk.org/jeps/277 It's interesting to note, for instance, that both Java and Rust designers considered a specific "replacement" field, but decided that it was unlikely to be useful in practice. The Java proposal states this nicely: > In practice, there is never a drop-in replacement API for > any deprecated API; there are always tradeoffs and > design considerations, or choices to be made among > several possible replacements. 
All of these topics require > discussion and are thus better suited for textual > documentation. The JetBrains attribute *does* include a "replacement" argument, but it's heavily tied into a specific use case: it contains a template used for code transformation in the IDE. Both it and "since" are explicitly marked "applicable only for PhpStorm stubs". Regards, Rowan Tommins [IMSoP]
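The point about reflection can be tried with a user-defined attribute. Of the quoted expressions, a comparison against PHP_VERSION_ID is the only kind that is a valid constant expression, so the sketch below uses one. `ExampleDeprecated` and `old_function` are hypothetical names, not the attribute proposed in the RFC.

```php
<?php
#[Attribute(Attribute::TARGET_FUNCTION)]
final class ExampleDeprecated
{
    public function __construct(public mixed $since = null) {}
}

#[ExampleDeprecated(since: PHP_VERSION_ID >= 80000)]
function old_function(): void {}

$attribute = (new ReflectionFunction('old_function'))->getAttributes()[0];

// Reflection exposes only the evaluated argument, not the expression
// that was written in the source.
$arguments = $attribute->getArguments();
$instance = $attribute->newInstance();

var_dump($arguments, $instance->since);
```

Both the argument list and the instantiated attribute carry a plain boolean, so a tool reading the attribute has no way to recover which version the comparison referred to.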
Re: [PHP-DEV] [RFC][Vote announcement] Property hooks
On 10 April 2024 04:40:13 BST, Juliette Reinders Folmer wrote: * Whether a type can be specified on the parameter on `set` depends on whether the property is typed. You cannot declare `set(mixed $value)` for an untyped property, even though it would effectively be compatible. This is inconsistent with the behaviour for, for instance, method overloads, where this is acceptable: https://3v4l.org/hbCor/rfc#vrfc.property-hooks , though it is consistent with the behaviour of property overloads, where this is not acceptable: https://3v4l.org/seDWM (anyone up for an RFC to fix this inconsistency ?) Just picking up on this point, because it's a bit of a tangle: PHP currently makes a hard distinction between "typed properties" and "untyped properties". For instance, unset() works differently, and the "readonly" modifier can only be added to a typed property. That's actually rather relevant to your point, because if this RFC passes we would probably need to consider that PHP has at least 4 types of properties: - dynamic properties (deprecated by default, but allowed with an attribute) - declared but untyped properties - typed properties - virtual properties But maybe 6, with: - untyped properties with hooks - typed properties with hooks Of course, most of the time, users aren't aware of the current 3-way split, and they won't need to think about all 6 of these variations. But there are going to be cases where documentation or a future RFC has to cover edge cases of each. I do think there is scope for removing some features from the RFC which are nice but not essential, and reducing these combinations. For instance, if we limit the access to the underlying property, we might be able to treat "virtual properties" as just an optimisation: the engine doesn't allocate a property it knows will never be accessed, and accesses to it, e.g. via reflection, just return "uninitialized". 
I am however conscious that RFCs have failed in the past for being "not complete enough" as well as for being "too complex". Regards, Rowan Tommins [IMSoP]
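The existing three-way split between dynamic, untyped and typed properties is easy to observe. The sketch below uses hypothetical class names and covers only behaviour that exists today; it does not involve property hooks.

```php
<?php
final class Example
{
    public $untyped;   // declared, untyped: implicitly initialised to null
    public int $typed; // typed: starts in the "uninitialized" state
}

$object = new Example();
$untypedValue = $object->untyped;

// Reading a typed property before initialisation throws an Error.
$typedError = null;
try {
    $value = $object->typed;
} catch (Error $error) {
    $typedError = $error->getMessage();
}

// Dynamic properties are deprecated since PHP 8.2 unless the class
// opts in with this attribute.
#[AllowDynamicProperties]
final class Flexible {}

$flexible = new Flexible();
$flexible->anything = 42;

var_dump($untypedValue, $typedError, $flexible->anything);
```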
Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath
On 10 April 2024 10:38:44 BST, Saki Takamachi wrote: >I was thinking about this today, and I think both are correct opinions on >whether to set the initial value to HALF_UP or TOWARD_ZERO. It's just a matter >of prioritizing whether consistency with existing behavior or consistency >within a class, and they can never be met simultaneously. Yes, I agree there's a dilemma there. The extra point in favour of TOWARD_ZERO is that it's more efficient, because we don't have to over-calculate and round, just pass scale directly to the implementation. Any other option makes for unnecessary extra calculation in code like this: $total = new Number('10'); $raw_frac = $total / 7; $rounded_frac = $raw_frac->round(2, Round::HALF_UP); If HALF_UP rounding is the implied default, we have to calculate with scale 11 giving 1.42857142857, round to 1.4285714286, then round again to 1.43. If truncation / TOWARD_ZERO is the implied default, we only calculate with scale 10 giving 1.4285714285 and then round once to 1.43. (Of course, in this example, the most efficient would be for the user to write $rounded_frac = $total->div(7, 2, Round::HALF_UP) but they might have reasons to keep the division and rounding separate.) Regards, -- Rowan Tommins [IMSoP]
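The two calculation paths can be reproduced with the existing procedural functions, using 10 / 7, which yields the digits quoted in the message. This is a sketch that assumes ext/bcmath is loaded; `round_half_up()` is a hypothetical helper that handles non-negative values only and is not how the RFC implements rounding.

```php
<?php
// Hypothetical helper: add half a unit in the last kept place, then let
// bcadd() truncate to the requested scale. Non-negative values only.
function round_half_up(string $number, int $scale): string
{
    $half = '0.' . str_repeat('0', $scale) . '5';

    return bcadd($number, $half, $scale);
}

// Implied HALF_UP default: the division is calculated with one extra
// digit and rounded, then the explicit round() call rounds again.
$raw = round_half_up(bcdiv('10', '7', 11), 10);
$viaHalfUp = round_half_up($raw, 2);

// Implied truncation default: one division at scale 10, one rounding.
$truncated = bcdiv('10', '7', 10);
$viaTruncation = round_half_up($truncated, 2);

echo $raw, ' ', $viaHalfUp, "\n";
echo $truncated, ' ', $viaTruncation, "\n";
```

Both paths end at the same two-digit result here; the first one simply performs an extra digit of division and an extra rounding step along the way.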
Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath
On 10 April 2024 00:36:21 BST, Saki Takamachi wrote: >- The scale and rounding mode are not required for example in add, since the >scale of the result will never be infinite and we can automatically calculate >the scale needed to fit the result. Does adding those two options to all >calculations mean adding them to calculations like add as well? That's why I mentioned the two different groups of users. The scale and rounding mode aren't there for group (a), who just want the scale to be managed automatically; they are there for group (b), who want to guarantee a particular result has a particular scale. The result of $a->add($b, 2, Round::HALF_UP) will always be the same as $a->add($b)->round(2, Round::HALF_UP) but is more convenient, and in some cases more efficient, since it doesn't calculate unnecessary digits. Remember also the title and original aim of the RFC: add object support to BCMath. The scale parameter is already there on the existing functions (bcadd, bcmul, etc), so removing it on the object version would be surprising. The rounding mode is a new feature, but there doesn't seem a good reason not to include it everywhere as well. >- As Tim mentioned, it may be confusing to have an initial value separate from >the mode of the `round()` method. Would it make sense to have an initial value >of HALF_UP? Again, the aim was to match the functionality of the existing functions. It's likely that users will migrate code written using bcdiv() to use BCMath\Number->div() and expect it to work the same, at least when specifying a scale. Having it behave differently by rounding up the last digit by default seems like a bad idea. Thinking about the implementation, the truncation behaviour also makes sense: the library isn't actually rounding anything, it's calculating digit by digit, and stopping when it reaches the requested scale. The whole concept of rounding is something that we are adding, presumably by passing $scale+1 to the underlying library functions. 
It's a nice feature to add, but not one that should be on by default, given we're not writing the extension from scratch. Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath
On 24/03/2024 13:13, Saki Takamachi wrote: https://wiki.php.net/rfc/support_object_type_in_bcmath Based on the various discussions we've been having, I'd like to propose a simplified handling of "scale". I think there are two groups of users we are trying to help: a) Users who want an "infinite" scale, and will round manually when absolutely necessary, e.g. for display. The scale can't actually be infinite in the case of calculations like 1/3, so they need some safe cut-off. b) Users who want to perform operations on a fixed scale, with configurable rounding, e.g. for e-commerce pricing. They are not interested in any larger scale, except possibly in some intermediate calculations, when they want the same as group (a). I propose: - The constructor accepts string|int $num only. - All operations accept an optional scale and rounding mode. - If no rounding mode is provided, the default behaviour is to truncate. This means that (new BCMath\Number('20'))->div(3, 5) has the same result as bcdiv('20', '3', 5) which is 6.66666 - If a rounding mode is provided, the object transparently calculates one extra digit of scale, then rounds according to the specified mode. - If no scale is provided, most operations will automatically calculate the required scale, e.g. add will use the larger of the two scales. This is the same as the current RFC. - If no scale is provided to div(), sqrt(), or pow(-$x), the result will be calculated to the scale of the left-hand operand, plus 10. This is the default behaviour in the current RFC. - Operator overloads behave the same as not specifying a scale or rounding mode to the corresponding method. Therefore (new BCMath\Number('20')) / (new BCMath\Number('3')) will result in 6.6666666666 - an automatic scale of 10, and truncation of further digits. Compared to the current RFC, that means: - Remove the ability to customise "max expansion scale". For most users, this is a technical detail which is more confusing than useful. 
Users in group (b) will never encounter it, because they will specify scale manually; advanced users in group (a) may want to customise the logic in different ways anyway. - Remove the ability for a Number value to carry around its own default rounding mode. Users in group (a) will never use it. Users in group (b) are likely to want the same rounding in the whole application, but providing it on every call to new Number() is no easier than providing it on each fixed-scale calculation. - Remove the $maxExpansionScale and $roundingMode properties and constructor parameters. - Remove withMaxExpansionScale and withRoundMode. - Remove all the logic around propagating rounding mode and expansion scale between objects. I've also noticed that the round method is currently defined as: - public function round(int $precision = 0, int $mode = PHP_ROUND_HALF_UP): Number {} Presumably $precision here is actually the desired scale of the result? If so, it should probably be named $scale, as in the rest of the interface. I realise it's called $precision in the global round() function; that's presumably a mistake which is now hard to fix due to named parameters. Ideally, it would be nice to have both roundToPrecision() and roundToScale(), but as Jordan explained, an implementation which actually calculated precision could be difficult and slow. Regards, -- Rowan Tommins [IMSoP]
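The division rules proposed in this message can be sketched in userland on top of the existing functions. This is not the RFC's implementation: `SketchNumber` is a hypothetical class, a boolean stands in for a full rounding-mode enum, only non-negative values are handled, and ext/bcmath is assumed to be loaded.

```php
<?php
final class SketchNumber
{
    public function __construct(public readonly string $value) {}

    // Number of digits after the decimal point.
    public function scale(): int
    {
        $dot = strpos($this->value, '.');

        return $dot === false ? 0 : strlen($this->value) - $dot - 1;
    }

    public function div(string $divisor, ?int $scale = null, bool $halfUp = false): self
    {
        // No scale given: use the left-hand operand's scale plus 10.
        $scale ??= $this->scale() + 10;

        if (!$halfUp) {
            // Default: truncate, exactly as bcdiv() does.
            return new self(bcdiv($this->value, $divisor, $scale));
        }

        // Rounding requested: calculate one extra digit, then round by
        // adding half a unit and letting bcadd() truncate.
        $extra = bcdiv($this->value, $divisor, $scale + 1);
        $half = '0.' . str_repeat('0', $scale) . '5';

        return new self(bcadd($extra, $half, $scale));
    }
}

$twenty = new SketchNumber('20');
$truncated = $twenty->div('3', 5)->value;
$rounded = $twenty->div('3', 5, true)->value;
$automatic = $twenty->div('3')->value;

echo $truncated, ' ', $rounded, ' ', $automatic, "\n";
```

Nothing in the sketch needs to be stored on the object apart from the value itself, which is the simplification the message argues for.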
Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?
On 8 April 2024 21:51:46 BST, Jordan LeDoux wrote: >I have mentioned before that my understanding of the deeper aspects of how >zvals work is very lacking compared to some others, so this is very >helpful. My own knowledge definitely has gaps and errors, and comes mostly from introductions like https://www.phpinternalsbook.com/ and in this case Nikita's blog articles about the changes in 7.0: https://www.npopov.com/2015/05/05/Internal-value-representation-in-PHP-7-part-1.html > I confess that I do not >understand the technical intricacies of the interned strings and packed >arrays, I just understand that the zval structure for these arbitrary >precision values would probably be non-trivial, and from what I was able to >research and determine that was in part related to the 64bit zval limit. From previous discussions, I gather that the hardest part of implementing a new zval type is probably not the memory structure itself - that will mostly be handled in a few key functions and macros - but the sheer number of places that do something different with each zval type and will need updating. Searching for Z_TYPE_P, which is just one of the macros used for that purpose, shows over 200 lines to check: https://heap.space/search?project=php-src=Z_TYPE_P=c That's why it's so much easier to wrap a new type in an object, because then all of those code paths are considered for you, you just have a fixed set of handlers to implement. If Ilija's "data classes" proposal progresses, you'll be able to have copy-on-write for free as well. Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?
, even making it into the "bundled" list doesn't mean it's installed by default everywhere, and userland libraries spend a lot of effort polyfilling things which would ideally be available by default. This is, essentially, the thesis of the research and work that I have done in the space since joining the internals mailing list. Thanks, there's some really useful perspective there. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?
On Mon, 8 Apr 2024, at 13:42, Arvids Godjuks wrote: > The ini setting I was considering would function similarly to what it does > for floats right now - I assume it changes the exponent, thereby increasing > their precision but reducing the integer range they can cover. If you're thinking of the "precision" setting, it doesn't do anything nearly that clever; it's purely about how many decimal digits should be *displayed* when converting a binary float value to a decimal string. In recent versions of PHP, it has a "-1" setting that automatically does the right thing in most cases. https://www.php.net/manual/en/ini.core.php#ini.precision The other way around - parsing a string to a float, including when compiling source code - has a lot of different compile-time options, presumably to optimise on different platforms; but no user options at all: https://github.com/php/php-src/blob/master/Zend/zend_strtod.c Regards, -- Rowan Tommins [IMSoP]
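The display-only nature of the "precision" setting can be seen by converting the same float to a string under different values. This is a small sketch using ini_set() at runtime; the variable names are made up for the example.

```php
<?php
$tenth = 0.1;

ini_set('precision', '14');
$at14 = (string) $tenth;

ini_set('precision', '17');
$at17 = (string) $tenth;

// -1 selects the shortest string that converts back to the same float.
ini_set('precision', '-1');
$auto = (string) $tenth;

// The stored binary value never changed, only its string form did.
$sumStillInexact = (0.1 + 0.2 !== 0.3);

echo $at14, ' ', $at17, ' ', $auto, "\n";
```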
Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type
On 8 April 2024 10:12:31 BST, Saki Takamachi wrote: > >I don't see any point in "scalar types" that feel almost like objects, because >it just feels like you're manipulating objects with procedural functions. Why >not just use objects instead? Again, I don't think "has more than one attribute" is the same as "feel almost like objects". But we're just getting further away from the current discussion, I think. >Sorry, but I have no idea what you mean by "numbers have rounding modes". >Numbers are just numbers, and if there's something other than numbers in >there, then to me it's an object. The proposed class is called BCMath\Number, which implies that every instance of that class represents a number, just as every instance of a class called DateTime represents a date and time. In the end, a class is just a type definition. In pure OOP, it defines the type by its behaviour (methods / messages); in practice, it also defines the properties that each value of the type needs. So I am saying that if you were designing a class to represent numbers, you would start by saying "what properties does every number value have?" I don't think "rounding mode" would be on that list, so I don't think it belongs on a class called Number. Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type
On 8 April 2024 01:34:45 BST, Saki Takamachi wrote: > > I'm making these opinions from an API design perspective. How the data is > held internally is irrelevant. zval has a lot of data other than values. What > I want to say is that if multiple types of parameters are required for > initialization, they may only serve as a substitute for object for the user. Again, that only seems related to objects because that's what you're used to in PHP, and even then you're overlooking an obvious exception: array(1, 2) If we ever do want to make decimals a native type, we would need some way to initialise a decimal value, since 1.2 will initialise a float. One of the most obvious options is a function-like syntax, decimal(1.2). If we do want numbers to carry extra information in each value, it will be no problem at all to support that. On the other side, just because something's easy doesn't mean it's the right solution. We could make an object which contained a number and an operation, and write this: $a = new NumberOp(42, 'add'); $b = $a->exec(15); $c = $b->withOperation('mul'); $d = $c->exec(2); I'm sure you'd agree that would be a bad design. So, again, I urge you to forget about it being easy to stick an extra property on an object, and think in the abstract: does it make sense to say "this number has a preferred rounding mode", rather than "this operation has a preferred rounding mode". Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?
On 07/04/2024 20:55, Jordan LeDoux wrote: I have been doing small bits of work, research, and investigation into an MPDec or MPFR implementation for years, and I'm likely to continue doing my research on that regardless of whatever is discussed in this thread. I absolutely encourage you to do that. What I'm hoping is that you can share some of what you already know now, so that while we're discussing BCMath\Number, we can think ahead a bit to what other similar APIs we might build in the future. The below seems to be exactly that. Yes. BCMath uses fixed-scale, all the other libraries use fixed-precision. That is, the other libraries use a fixed number of significant digits, while BCMath uses a fixed number of digits after the decimal point. That seems like a significant difference indeed, and one that is potentially far more important than whether we build an OO wrapper or a "scalar" one. So, for instance, it would not actually be possible without manual rounding in the PHP implementation to force exactly 2 decimal digits of accuracy in the result and no more with MPDec. The current BCMath proposal is to mostly choose the scale calculations automatically, and to give precise control of rounding. Neither of those are implemented in libbcmath, which requires an explicit scale, and simply truncates the result at that point. That's why I said that the proposal isn't really about "an OO wrapper for BCMath" any more, it's a fairly generic Number API, with libbcmath as the back-end which we currently have available. So thinking about what other back-ends we might build with the same or similar wrappers is useful and relevant. The idea of money, for instance, wanting exactly two digits would require the implementation to round, because something like 0.00000013 has two digits of *precision*, which is what MPDec uses, but it has 8 digits of scale which is what BCMath uses. This brings us back to what the use cases are we're trying to cover with these wrappers. 
The example of fixed-scale money is not just a small niche that I happen to know about: brick/money has 16k stars on GitHub, and 18 million installs on Packagist; moneyphp/money has 4.5k stars and 45 million installs; one has implementations based on plain PHP, GMP, and BCMath; the other has a hard dependency on BCMath. Presumably, there are other use cases where working with precision rather than scale is essential, maybe just as popular (or that could be just as popular, if they could be implemented better). In which case, should we be designing a NumberInterface that provides both, with BCMath having a custom (and maybe slow) implementation for round-to-precision, and MPDec/MPFR having a custom (and maybe slow) implementation for round-to-scale? Or, should we abandon the idea of having one preferred number-handling API (whether that's NumberInterface or a core decimal type), because no implementation could handle both use cases? Regards, -- Rowan Tommins [IMSoP]
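The difference between scale and precision can be illustrated on plain decimal strings. The two helpers below are hypothetical and deliberately simple: they assume non-negative input without signs or exponent notation, and they count trailing zeros as significant.

```php
<?php
// Scale: the number of digits after the decimal point.
function decimal_scale(string $number): int
{
    $dot = strpos($number, '.');

    return $dot === false ? 0 : strlen($number) - $dot - 1;
}

// Precision: the number of significant digits, ignoring leading zeros.
function decimal_precision(string $number): int
{
    $digits = ltrim(str_replace('.', '', $number), '0');

    return max(1, strlen($digits));
}

foreach (['19.99', '1999.99', '0.00000013'] as $number) {
    printf(
        "%s: scale %d, precision %d\n",
        $number,
        decimal_scale($number),
        decimal_precision($number)
    );
}
```

Two prices with the same fixed scale of 2 have different precisions, so a fixed-precision back-end cannot express "always two decimal places" as a single setting; some rounding to scale has to happen in the wrapper.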
Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type
On 07/04/2024 14:27, Saki Takamachi wrote: If we really wanted decimal to be a native type, then the rounding mode and scale behavior should be completely fixed and not user selectable. If not, decimal should be implemented as a class. As I replied to Jordan, I don't see why this is connected to "scalar" vs "object" types at all. An object - particularly an immutable one - is just a way of declaring a type, and some syntax for operations on that type. There's really no difference at all between these: $half = $whole / 2; $half = numeric_div($whole, 2); $half = $whole->div(2); In PHP, right now, the last one is only available on objects, but there have been proposals in the past to change that; it's just syntax. For rounding, the first one is the real problem, because there's nowhere to put an extra operand. That problem is the same for a class with an overloaded "/" operator, and a "scalar" type which has a definition of "/" in the engine. Maybe it feels more obvious that an object can carry extra state in private properties, but internal types don't actually need private properties at all. PHP's "array" type has a bunch of different state without being an object (a linked list of items, a hashtable for random access, an iteration pointer, etc); and SimpleXMLElement and DOMNode are exposed in PHP as separate classes, but actually store state in the same C struct provided by libxml2. So I see it just as an API design decision: do we specify the rounding mode of numeric division a) on every operation; b) on every value; c) in a runtime setting (ini_set); d) in a lexically scoped setting (declare)? My vote is for (a), maybe with (d) as a fallback. Regards, -- Rowan Tommins [IMSoP]
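PHP's existing round() function is an example of option (a), with the mode supplied on each call, while the "/" operator shows why operators are the awkward case. The sketch below uses only built-in functions and constants.

```php
<?php
// The rounding mode is an argument of the operation, not of the value.
$halfUp = round(2.5);                           // default PHP_ROUND_HALF_UP
$halfEven = round(2.5, 0, PHP_ROUND_HALF_EVEN); // ties go to the even neighbour
$halfDown = round(2.5, 0, PHP_ROUND_HALF_DOWN); // ties go towards zero

// An operator has no slot for a third operand, so any rounding rule for
// "/" has to come from somewhere else: the values, or a setting.
$quotient = 5 / 2;

var_dump($halfUp, $halfEven, $halfDown, $quotient);
```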
Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath
On 07/04/2024 18:09, Tim Düsterhus wrote: - I'm not sure if the priority for the rounding modes is sound. My gut feeling is that operations on numbers with different rounding modes should be disallowed or made explicit during the operation (much like the scale for a division), but I'm not an expert in designing numeric APIs, so I might be wrong here. Personally, I'm not a fan of setting the rounding mode and the "max expansion scale" on each instance, for the same reason I'm not keen on having the collation on each instance in Derick's Unicode string draft. I understand the temptation: specifying it for every operation makes code more verbose, particularly since it rules out use of $a / $b; while specifying it as a global or scoped option would make code harder to reason about. But I think carrying it around on the instance doesn't really solve either problem, and creates several new ones: - A program which wants all operations to use the same rounding system still has to specify the options every time it initialises a value, which is probably nearly as often as operating on them. - A program which wants different modes at different times will end up calling $foo->withRoundingMode(RoundingMode::HALF_UP)->div(2), which is more verbose and probably slower than $foo->div(2, RoundingMode::HALF_UP) - You can't look at a function accepting a Number as input and know what rounding mode it will operate in, unless it explicitly changes it. It would be easier to scan up to find a per-file / per-block declare() directive, than to trace the calling code to know the rounding mode of an instance. - A complex set of rules is invented to "prioritise" the options in operations like $a + $b. Or, that operation has to be forbidden unless the mode is consistent, at which point it might as well be a global setting. 
As a thought experiment for comparison, imagine if to sort an array numerically you had to write this: $array = array_set_sorting_mode($array, SORT_NUMERIC); $array = array_sort($array); Or worse, if you had to set it on each string: $array = array_map(fn($s) => $s->withSortingMode(SORT_NUMERIC), $array); $array = array_sort($array); Rather than (assuming we replaced the current by-reference sorts): $array = array_sort($array, SORT_NUMERIC); Because we're designing an object, attaching extra properties to it is easy, but I don't think it actually makes it easy to use. Regards, -- Rowan Tommins [IMSoP]
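For reference, this is roughly how the per-operation option already looks with the existing by-reference sort() and its flags argument. It is only a small sketch of current behaviour; the variable names are arbitrary.

```php
<?php
$values = ['10', '9', '1'];

// The comparison mode is an argument of the operation, not state on the data.
$asStrings = $values;
sort($asStrings, SORT_STRING);  // compare items as strings

$asNumbers = $values;
sort($asNumbers, SORT_NUMERIC); // compare items as numbers

var_dump($asStrings); // '1', '10', '9'
var_dump($asNumbers); // '1', '9', '10'
```

Reading either call tells you which ordering applies, without tracing where $values came from.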
Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?
On 7 April 2024 15:38:04 BST, Saki Takamachi wrote: >> In other words, looking at how the efforts overlap doesn't have to mean >> abandoning one of them, it can mean finding how one can benefit the other. > >I agree that the essence of the debate is as you say. >However, an argument must always reach a conclusion based on its purpose, and >combining two arguments with different purposes can make it unclear how to >reach a conclusion. Well, that's the original question: are they actually different purposes, from the point of view of a user? I just gave a concrete suggestion, which didn't involve "combining two arguments", it involved splitting them up into three projects which all complement each other. It feels like both you and Jordan feel the need to defend the work you've put in so far, which is a shame; as a neutral party, I want to benefit from *both* of your efforts. It really doesn't matter to me how many mailing list threads that requires, as long as there aren't two teams making conflicting designs for the same feature. Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?
On 7 April 2024 11:44:22 BST, Saki Takamachi wrote: >I don't think the two threads can be combined because they have different >goals. If one side of the argument was, "How about to add BCMath?" then >perhaps we should merge the discussion. But BCMath already exists and the >agenda is to add an OOP API. > >In other words, one is about adding new features, and the other is about >improving existing features. While I appreciate that that was the original aim, a lot of the discussion at the moment isn't really about BCMath at all, it's about how to define a fixed-precision number type. For instance, how to specify precision and rounding for operations like division. I haven't seen anywhere in the discussion where the answer was "that's how it already works, and we're not adding new features". Is there anything in the proposal which would actually be different if it was based on a different library, and if not, should we be designing a NumberInterface which multiple extensions could implement? Then Jordan's search for a library with better performance could lead to new extensions implementing that interface, even if they have portability or licensing problems that make them awkward to bundle in core. Finally, there's the separate discussion about making a new "scalar type". As I said in a previous email, I'm not really sure what "scalar" means in this context, so maybe "integrating the type more directly into the language" is a better description? That includes memory/copying optimisation (potentially linked to Ilija's work on data classes), initialisation syntax (which could be a general feature), and accepting the type in existing functions (something frequently requested for custom array-like types). In other words, looking at how the efforts overlap doesn't have to mean abandoning one of them, it can mean finding how one can benefit the other. Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?
On 7 April 2024 01:32:29 BST, Jordan LeDoux wrote: >Internals is just volunteers. The people working on BCMath are doing that >because they want to, the people working on scalar decimal stuff are doing >that because they want to, and there's no project planning to tell one >group to stop. That's not how internals works (to the extent it works). I kind of disagree. You're absolutely right that the detailed effort is almost always put in by people working on things that interest them, and I want to make clear up front that I'm extremely grateful for the amount of effort people do volunteer, given how few are paid to work on any of this. However, the goal of the Internals community as a whole is to choose what changes to make to a language which is used by millions of people. That absolutely involves project planning, because there isn't a marketplace of PHP forks with different competing features, and once a feature is added it's very hard to remove it or change its design. If - and I stress I'm not saying this is true - IF these two features have such an overlap that we would only want to release one, then we shouldn't just accept whichever is ready first, we should choose which is the better solution overall. And if that was the case, why would we wait for a polished implementation of both, then tell one group of volunteers that all their hard work had been a waste of time? So I think the question is very valid: do these two features have distinct use cases, such that even if we had one, we would still want to spend time on the other? Or, should we decide a strategy for both groups to work together towards a single goal? That's not about "telling one group to stop", it's about working together for the benefit of both users and the people volunteering their effort, to whom I am extremely grateful. Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath
On 06/04/2024 07:24, Saki Takamachi wrote: Take a look at the methods shown below: ``` protected static function resultPropertyRules(string $propertyName, mixed $value1, mixed $value2): mixed {} ``` This method determines which operand value to use in the result after calculation for a property that the user has defined by extending the class. While this is an intriguing idea, it only solves a narrow set of use cases. For instance: - The class might want different behaviour for different operations; e.g. Money(42, 'USD') + Money(42, 'USD') should give Money(84, 'USD'); but Money(42, 'USD') * Money(42, 'USD') should be an error. - Properties might need to interact with each other; e.g. Distance(2, 'metres') + Distance(2, 'feet') could result in Distance(2.6096, 'metres'); but if you calculate one property at a time, you'll end up with Distance(4, 'metres'), which is clearly wrong. The fundamental problem is that it ignores the OOP concept of encapsulation: how an object stores its internal state should not define its behaviour. Instead, the object should be able to directly define behaviour for the operations it supports. Regards, -- Rowan Tommins [IMSoP]
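A minimal sketch of the Distance case, where the class defines the behaviour of the operation itself so that amount and unit are handled together. The class, its add() method, and the tiny conversion table are all hypothetical, not part of the BCMath proposal.

```php
<?php
// Hypothetical value object; the conversion table only covers what the example needs.
final class Distance
{
    private const METRES_PER_UNIT = ['metres' => 1.0, 'feet' => 0.3048];

    public function __construct(
        public readonly float $amount,
        public readonly string $unit,
    ) {
        if (!isset(self::METRES_PER_UNIT[$unit])) {
            throw new InvalidArgumentException("Unknown unit: $unit");
        }
    }

    // Amount and unit are combined in one step; the result keeps the left operand's unit.
    public function add(Distance $other): Distance
    {
        $otherInMetres  = $other->amount * self::METRES_PER_UNIT[$other->unit];
        $otherInOwnUnit = $otherInMetres / self::METRES_PER_UNIT[$this->unit];

        return new Distance($this->amount + $otherInOwnUnit, $this->unit);
    }
}

$sum = (new Distance(2, 'metres'))->add(new Distance(2, 'feet'));
printf("%.4f %s\n", $sum->amount, $sum->unit); // 2.6096 metres
```

A per-property rule cannot express this, because the right amount depends on both units at once.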
Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath
On 5 April 2024 19:30:24 BST, Jordan LeDoux wrote: >A composed class >does not somehow prevent the accidental error of mixing currencies, it just >moves where that error would occur It does not prevent someone accidentally *attempting* to mix currencies, but it absolutely prevents that mistake leading to bogus values in the application, because the methods available can all detect it and throw an Error. If the class has a mixture of currency-aware methods, and methods / operator overloads inherited from Number, you can end up getting nonsensical results instead of an Error. >If you want an actual answer about how a Money class would actually work in >practice, it would likely be something like this: > >``` >// $val1 and $val2 are instances of the Money class with unknown currencies >$val1Currency = $val1->getCurrency(); >$val2Currency = $val2->getCurrency(); >$val1 = $val1->convertToCommonCurrency(); >$val2 = $val2->convertToCommonCurrency(); >// perform the necessary calculations using operators >$answer = $answer->convertToCurrency($userCurrency); >``` You have missed out the key section: how do you actually add the two numbers? The addition MUST enforce the precondition that its operands are in the same currency; any other behaviour is nonsensical. So the definition of that method must be on the Money class, not inherited from Number. If the add() method on Number is final, you'll need to define a new method $val1->addCurrency($val2). The existing add() method and operator overload will be inherited unchanged, but calling them won't just be useless, it will be dangerous, because they can give nonsensical results. That's what makes composition the better design in this case, because the Money class can expose only the methods that actually have useful and safe behaviour. The fact that the composed class can't add its own operator overloads is unfortunate; but allowing inheritance wouldn't solve that, because the inherited overloads are all wrong anyway. 
Regards, Rowan Tommins [IMSoP]
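As a rough sketch of the composition approach: the Money class below holds its amount privately and exposes only the operations that make sense for it. To stay dependency-free it stores integer minor units rather than wrapping a BCMath Number, and all names are hypothetical.

```php
<?php
// Hypothetical sketch: amounts are integer minor units (e.g. cents).
final class Money
{
    public function __construct(
        private int $minorUnits,
        private string $currency,
    ) {}

    // The only way to add: the currency precondition is always checked.
    public function add(Money $other): Money
    {
        if ($other->currency !== $this->currency) {
            throw new InvalidArgumentException(
                "Cannot add {$other->currency} to {$this->currency}"
            );
        }
        return new Money($this->minorUnits + $other->minorUnits, $this->currency);
    }

    public function getMinorUnits(): int { return $this->minorUnits; }
    public function getCurrency(): string { return $this->currency; }
}

$total = (new Money(4200, 'USD'))->add(new Money(4200, 'USD'));

try {
    (new Money(4200, 'USD'))->add(new Money(4200, 'EUR'));
    $mixedRejected = false;
} catch (InvalidArgumentException $e) {
    $mixedRejected = true; // the mistake surfaces immediately instead of producing a bogus value
}
```

Because nothing is inherited, there is no currency-unaware add() or operator left over for a caller to reach by accident.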
Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type
On 04/04/2024 23:31, Jordan LeDoux wrote: Well, firstly most of the financial applications that I've worked in (I work for a firm that writes accounting software right now) do not calculate intermediate steps like this with fixed precision, or even have an interest in doing so. My background is in e-commerce (specifically, travel) rather than finance. In that context, it's common to have a single-step operation like "per_person_price equals total_price divided by number of passengers" where both per_person_price and total_price are going to be expressed with the same accuracy. The top two results for "money" on Packagist are https://www.moneyphp.org/ and https://github.com/brick/money both of which take this general model: the scale of values is fixed, and every operation that might produce fractions of that requires a rounding mode. Truly "fixed-precision" is not something that decimal should even try to be, in my opinion. The use cases where you CANNOT simply round the result at the end to fit your display output or your storage location are very minimal. In that case, why do we need to think about the scale or precision of a decimal at all? What would the syntax 1.234_d3 do differently from 1.234_d? I mean, what you are describing is how such OBJECTS are designed in other languages like Python, but not scalars. I don't see any connection at all between what I'm describing and objects. I'm talking about what operations make sense on a particular data type, regardless of how that data type is implemented. To be honest, I'm not really sure what "scalar" means in this context. In PHP, we call strings "scalars" because they're neither "arrays" nor "objects"; but none of those have definitions which are universal to other languages. This kind of precision restriction isn't something you would place on an individual value, it's something you would place on all calculations. 
That's why in Python this is done with a global runtime setting using `getcontext().prec` and `getcontext().rounding`. A particular value having a precision of X doesn't imply anything concrete about a calculation that uses that value necessarily. Global settings avoid needing extra parameters to each operation, but don't really work for the use case I'm describing: different currencies have different "natural" scale, e.g. Japanese Yen have a scale of 0 (no fractional Yen), Bitcoin has a scale of 8 (100 million satoshis in 1 bitcoin). A program dealing with multiple currencies will want to assign a different scale to different values. Maybe we're just not understanding each other. Are you opposed to the idea of doing this as a scalar? Not at all; my first examples used method syntax, because I was basing them on https://github.com/brick/money In my last e-mail, I also gave examples using normal function syntax. What I'm saying is that $x / 2 doesn't have a good answer if $x is a fixed-precision number which can't be divided by 2 without exceeding that precision. You need a third operand, the rounding mode, so you can't write it as a binary operator, and need some kind of function like decimal_div(decimal $dividend, int|decimal $divisor, RoundingMode $roundingMode). How you implement "decimal" doesn't change that at all. Regards, -- Rowan Tommins [IMSoP]
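To illustrate the three-operand shape, here is a minimal sketch that divides a fixed-scale amount held as integer minor units. The function fixedScaleDiv() and the enum SketchRounding are invented for this example; the enum is deliberately not called RoundingMode because PHP 8.4 ships a built-in enum of that name.

```php
<?php
// Hypothetical names, for illustration only.
enum SketchRounding
{
    case Down;   // towards zero
    case Up;     // away from zero
    case HalfUp; // nearest, ties away from zero
}

// Divide an amount held as integer minor units, staying at the same scale.
function fixedScaleDiv(int $minorUnits, int $divisor, SketchRounding $mode): int
{
    if ($divisor === 0) {
        throw new DivisionByZeroError('Division by zero');
    }
    $quotient  = intdiv($minorUnits, $divisor); // truncates towards zero
    $remainder = $minorUnits % $divisor;
    if ($remainder === 0) {
        return $quotient; // exact result: no rounding decision needed
    }
    // Which direction is "away from zero" for this particular result.
    $awayFromZero = (($minorUnits < 0) !== ($divisor < 0)) ? -1 : 1;

    return match ($mode) {
        SketchRounding::Down   => $quotient,
        SketchRounding::Up     => $quotient + $awayFromZero,
        SketchRounding::HalfUp => abs($remainder) * 2 >= abs($divisor)
            ? $quotient + $awayFromZero
            : $quotient,
    };
}

// 1.03 at scale 2 is 103 minor units; the exact half would be 51.5.
var_dump(fixedScaleDiv(103, 2, SketchRounding::Down));   // int(51)
var_dump(fixedScaleDiv(103, 2, SketchRounding::HalfUp)); // int(52)
```

The same function works unchanged for a scale-0 currency or a scale-8 one, since the scale is simply whatever the caller's minor unit represents.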
Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
On 03/04/2024 00:01, Ilija Tovilo wrote: Data classes are classes with a single additional > zend_class_entry.ce_flags flag. So unless customized, they behave as > classes. This way, we have the option to tweak any behavior we would > like, but we don't need to. > > Of course, this will still require an analysis of what behavior we > might want to tweak. Regardless of the implementation, there are a lot of interactions we will want to consider; and we will have to keep considering new ones as we add to the language. For instance, the Property Hooks RFC would probably have needed a section on "Interaction with Data Classes". On the other hand, maybe having two types of objects to consider each time is better than having to consider combinations of lots of small features. On a practical note, a few things I've already thought of to consider: - Can a data class have readonly properties (or be marked "readonly data class")? If so, how will they behave? - Can you explicitly use the "clone" keyword with an instance of a data class? Does it make any difference? - Tied into that: can you implement __clone(), and when will it be called? - If you implement __set(), will copy-on-write be triggered before it's called? - Can you implement __destruct()? Will it ever be called? Consider this example, which would > work with the current approach: > > $shapes[0]->position->zero!(); I find this concise example confusing, and I think there's a few things to unpack here... Firstly, there's putting a data object in an array: $numbers = [ new Number(42) ]; $cow = $numbers; $cow[0]->increment!(); assert($numbers !== $cow); This is fairly clearly equivalent to this: $numbers = [ 42 ]; $cow = $numbers; $cow[0]++; assert($numbers !== $cow); CoW is triggered on the array for both, because ++ and ->increment!() are both clearly modifications. 
Second, there's putting a data object into another data object: $shape = new Shape(new Position(42,42)); $cow = $shape; $cow->position->zero!(); assert($shape !== $cow); This is slightly less obvious, because it presumably depends on the definition of Shape. Assuming Position is a data class: - If Shape is a normal class, changing the value of $cow->position just happens in place, and the assertion fails - If Shape is a readonly class (or position is a readonly property on a normal class), changing the value of $cow->position shouldn't be allowed, so this will presumably give an error - If Shape is a data class, changing the value of $shape->position implies a "mutation" of $shape itself, so we get a separation before anything is modified, and the assertion passes Unlike in the array case, this behaviour can't be resolved until you know the run-time type of $shape. Now, back to your example: $shapes = [ new Shape(new Position(42,42)) ]; $cow = $shapes; $shapes[0]->position->zero!(); assert($cow !== $shapes); This combines the two, meaning that now we can't know whether to separate the array until we know (at run-time) whether Shape is a normal class or a data class. But once that is known, the whole of "->position->zero!()" is a modification to $shapes[0], so we need to separate $shapes. Without such a class-wide marker, you'll need to remember to add the special syntax exactly where applicable. $shapes![0]!->position!->zero(); The array access doesn't need any special marker, because there's no ambiguity. The ambiguous call is the reference to ->position: in your current proposal, this represents a modification *if Shape is a data class, and is itself being modified*. My suggestion (or really, thought experiment) was that it would represent a modification *if it has a ! in the call*. 
So if Shape is a readonly class: $shapes[0]->position->!zero(); // Error: attempting to modify readonly property Shape::$position $shapes[0]->!position->!zero(); // OK; an optimised version of: $shapes[0] = clone $shapes[0] with [ 'position' => (clone $shapes[0]->position with ['x'=>0,'y'=>0]) ]; If ->! is only allowed if the RHS is either a readonly property or a mutating method, then this can be reasoned about statically: it will either error, or cause a CoW separation of $shapes. It also allows classes to mix aspects of "data class" and "normal class" behaviour, which might or might not be a good idea. This is mostly just a thought experiment, but I am a bit concerned that code like this is going to be confusingly ambiguous: $item->shape->position->zero!(); What is going to be CoW cloned, and what is going to be modified in place? I can't actually know without knowing the definition behind both $item and $item->shape. It might even vary depending on input. Regards, -- Rowan Tommins [IMSoP]
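For comparison, this is how the two existing behaviours look in current PHP, without any of the proposed syntax: arrays separate on write, while objects stored in an array are shared handles. The Counter class is just a stand-in for illustration.

```php
<?php
final class Counter
{
    public function __construct(public int $value) {}
}

// Arrays have value semantics: the write separates $copy from $numbers.
$numbers = [42];
$copy = $numbers;
$copy[0]++;
var_dump($numbers[0], $copy[0]); // int(42), int(43)

// Objects inside arrays are handles: the array is copied, the object is not.
$counters = [new Counter(42)];
$copyOfCounters = $counters;
$copyOfCounters[0]->value++;
var_dump($counters[0]->value, $copyOfCounters[0]->value); // int(43), int(43)
```

The ambiguity discussed above is about which of these two outcomes a chain like ->shape->position->zero!() would produce at each step.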
Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type
On 04/04/2024 02:29, Jordan LeDoux wrote: But when it comes to fixed-precision values, it should follow rules very similar to those we discussed in the BCMath thread: - Addition and subtraction should return a value that is the largest scale/precision of any operands in the calculation. - Division and multiplication should return a value that is the sum of the scale/precision of any operands + 2 or a default (perhaps configurable) value if the sum is small, to ensure that rounding occurs correctly. Near zero, floats have about 12-ish decimal digits of accuracy, and will return their full accuracy for example. I haven't followed the discussion in the other thread, but I'm not sure what the use case would be for a "fixed scale decimal" that followed those rules. As mentioned before, the use case I've encountered is money calculations, where what people want to fix is the smallest unit of account - e.g. €0.01 for practical currency, or €0.0001 for detailed accounting / trading. If I write $total = 1.03_d2; $perPerson = $total / 2; I want a result of 0.51_d2 or 0.52_d2 - that's why I specified a scale of 2 in the first place. If I want an accurate result of 0.515_d3, I would just specify 1.03_d, since the scale hasn't had any effect on the result. If I want a lossless split into [0.51_d2, 0.52_d2] I still need a function to exist somewhere, whether you spell that $total->split(2), or decimal_split($total, 2), etc. So it seems safer to also have $total->div(2, Round::DOWN) or decimal_div($total, 2, Round::DOWN) and have $total / 2 give an error. Possibly, it could only error if the result doesn't fit in the scale, so that this would be fine: $total = 11.00_d2; $perPerson = $total / 2; assert($perPerson === 5.50_d2); Or possibly, it would just be an error to perform division on a fixed scale decimal, but allowed on a variable-scale decimal. Regards, -- Rowan Tommins [IMSoP]
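The lossless split can be sketched on integer minor units, so that the shares always add back up to the original amount. splitMinorUnits() is a hypothetical helper, and giving the leftover units to the last shares is just one possible policy.

```php
<?php
// Hypothetical helper: splits integer minor units into $parts shares that sum to the original.
function splitMinorUnits(int $minorUnits, int $parts): array
{
    if ($parts < 1 || $minorUnits < 0) {
        throw new InvalidArgumentException('This sketch handles non-negative amounts and at least one part');
    }
    $base  = intdiv($minorUnits, $parts);
    $extra = $minorUnits % $parts; // this many shares receive one extra minor unit

    $shares = array_fill(0, $parts, $base);
    for ($i = 0; $i < $extra; $i++) {
        $shares[$parts - 1 - $i]++; // hand the leftovers to the last shares
    }
    return $shares;
}

var_dump(splitMinorUnits(103, 2)); // 51 and 52, i.e. 0.51 and 0.52 at scale 2
```

Nothing is rounded away here, which is the property that a plain $total / 2 cannot offer at a fixed scale.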
Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
On 02/04/2024 01:17, Ilija Tovilo wrote: I'd like to introduce an idea I've played around with for a couple of weeks: Data classes, sometimes called structs in other languages (e.g. Swift and C#). Hi Ilija, I'm really interested to see how this develops. A couple of thoughts that immediately occurred to me... I'm not sure if you've considered it already, but mutating methods should probably be constrained to be void (or maybe "mutating" could occupy the return type slot). Otherwise, someone is bound to write this: $start = new Location('Here'); $end = $start->move!('There'); Expecting it to mean this: $start = new Location('Here'); $end = $start; $end->move!('There'); When it would actually mean this: $start = new Location('Here'); $start->move!('There'); $end = $start; I seem to remember when this was discussed before, the argument being made that separating value objects completely means you have to spend time deciding how they interact with every feature of the language. Does the copy-on-write optimisation actually require the entire class to be special, or could it be triggered by a mutating method on any object? To allow direct modification of properties as well, we could move the call-site marker slightly to a ->! operator: $foo->!mutate(); $foo->!bar = 42; The first would be the same as your current version: it would perform a CoW reference separation / clone, then call the method, which would require a "mutating" marker. The second would essentially be an optimised version of $foo = clone $foo with [ 'bar' => 42 ] During the method call or write operation, readonly properties would allow an additional write, as is the case in __clone and the "clone with" proposal. So a "pure" data object would simply be declared with the existing "readonly class" syntax. The main drawback I can see (outside of the implementation, which I can't comment on) is that we couldn't overload the === operator to use value semantics. 
In exchange, a lot of decisions would simply be made for us: they would just be objects, with all the same behaviour around inheritance, serialization, and so on. Regards, -- Rowan Tommins [IMSoP]
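For context, this is roughly what the longhand looks like in current PHP, using a readonly property and a "wither" method. It is only a sketch with made-up names; the ->! operator and "clone with" discussed above do not exist, and this is the pattern they would abbreviate.

```php
<?php
final class Location
{
    public function __construct(public readonly string $name) {}

    // Returns a modified copy; the original is never changed.
    public function withName(string $name): static
    {
        return new static($name);
    }
}

$start = new Location('Here');
$end   = $start->withName('There');

var_dump($start->name); // string(4) "Here"
var_dump($end->name);   // string(5) "There"
```

Here the return value is the only way to observe the change, so there is no doubt about which variable ends up holding 'There'.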
Re: [PHP-DEV] Requiring GPG Commit Signing
On 02/04/2024 20:02, Ilija Tovilo wrote: But, does it matter? I'm not sure we look at some commits closer than others, based on its author. It's true that it might be easier to identify malicious commits if they all come from the same user, but it wouldn't prevent them. It's like the difference between stealing someone's credit card, and cloning the card of everyone who comes into the shop: in the first case, someone needs to check their credit card statements carefully; in the second, you'll have a hard job even working out who to contact. Similarly, if you discover a compromised key or signing account, you can look for uses of that key or account, which might be a tiny number from a non-core contributor; if you discover a compromised account pushing unsigned commits, you have to audit every commit in the repository. I agree it's not a complete solution, but no security measure is; it's always about reducing the attack surface or limiting the damage. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] Requiring GPG Commit Signing
On 02/04/2024 18:27, Ilija Tovilo wrote: If your GitHub account is compromised, [...] the attacker may simply register their own gpg key in your account, with the commits appearing as verified. If your ssh key is compromised instead, and you use ssh to sign your commits, the attacker may sign their malicious commits with that same key they may use to push. The key point (pun not intended) is that git doesn't record who pushed a commit - pushing is just data synchronization, not part of the history. What it records is who "authored" the commit, and by default that's just plain text; so if somebody compromises an SSH key or access token authorised to your GitHub account, they can push commits "authored by" Derick, or Nikita, or Bill Gates, and there is no way to tell them apart from the real thing. In fact, you don't need to compromise anybody's key: you could socially engineer a situation where you have push access to the repository, or break the security in some other way. As I understand it, this is exactly what happened 3 years ago: someone gained direct write access to the git.php.net server, and added commits "authored by" Nikita and others to the history in the repository. If all commits are signed, a compromised key or account can only be used to sign commits with that specific identity: your GitHub account can't be used to sign commits as Derick or Nikita, only as you. The impact is limited to one identity, not the integrity of the entire repository. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] Requiring GPG Commit Signing
On Tue, 2 Apr 2024, at 15:15, Derick Rethans wrote: > Hi, > > What do y'all think about requiring GPG signed commits for the php-src > repository? I actually thought this was already required since the github move (and the events that led to it) 3 years ago. It was certainly discussed: https://externals.io/message/113838#113840 and a user guide was created on the PHP wiki: https://wiki.php.net/vcs/commit-signing Feedback for the idea was generally positive, but maybe nobody got around to actually doing it. -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC] Invoke __callStatic when non-static public methods are called statically
On 29/03/2024 18:14, Robert Landers wrote: When generating proxies for existing types, you often need to share some state between the proxies. To do that, you put static methods/properties on the proxy class and hope to the PHP Gods that nobody will ever accidentally name something in their concrete class with the name you chose for things. To help with that, you create some kind of insane prefix. Separating static and non-static methods wouldn't solve this - the concrete class could equally add a static method with the same name but a different signature, and your generated proxy would fail to compile. In fact, exactly the same thing happens with instance methods in testing libraries: test doubles have a mixture of methods for configuring mock / spy behaviour, and methods mimicking or forwarding calls to the real interface / class. Those names could collide, and require awkward workarounds. In a statically typed language, a concrete class can have two methods with the same name, but different static types, e.g. when explicitly implementing interfaces. In a "duck typing" system like PHP's, that's much trickier, because a call to $foo->bar() doesn't have a natural way to choose which "bar" is meant. I'd much rather see static and non-static methods being able to have the same name Allowing this would lead to ambiguous calls, because as others have pointed out, :: doesn't always denote a static call. Consider this code: class Test { public function test() { echo 'instance test'; } public static function test() { echo 'static test'; } } class Test2 extends Test { public function runTest() { parent::test(); } } (new Test2)->runTest(); Currently, this can call either of the test() methods if you comment the other out: https://3v4l.org/5HlPE https://3v4l.org/LBALm If both are defined, which should it call? And if you wanted the other, how would you specify that? We would need some new syntax to remove the ambiguity. Regards, -- Rowan Tommins [IMSoP]
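A small sketch of why :: cannot be read as "static call": with parent::, the method still runs with $this bound to the calling object. The class names are arbitrary.

```php
<?php
class Base
{
    public string $label = 'base object';

    public function describe(): string
    {
        // $this is available, so this ran as an instance call.
        return 'instance call on ' . $this->label;
    }
}

class Child extends Base
{
    public function run(): string
    {
        return parent::describe(); // "::" here forwards an instance call
    }
}

$child = new Child;
$child->label = 'child object';
echo $child->run(), "\n"; // instance call on child object
```

If a static describe() with the same name were also allowed, this call site would have no way to say which one it wants.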
Re: [PHP-DEV] [RFC] Invoke __callStatic when non-static public methods are called statically
On 29/03/2024 02:39, 하늘아부지 wrote: I created a wiki for __callStatic related issues. Please see: https://wiki.php.net/rfc/complete_callstatc_magic Hi, Several times in the discussion you have said (in different words) "__callStatic is called for instance methods which are private or protected", but that is not how it is generally interpreted. If you are calling a method from outside the class, as far as you're concerned only public methods exist; private methods are, by definition, hidden implementation details. This is more obvious in languages with static typing, where if you have an instance of some interface, only the methods on that interface exist; the concrete object might actually have other methods, but you can't access them. That is what is meant by "inaccessible": __call and __callStatic are called for methods which, as seen from the current scope, *do not exist*. You could still argue that static context is like a different scope, or a different statically typed interface - as far as that context is concerned, only static methods exist. But that's also not a common interpretation, for (at least) two reasons: Firstly, there is no syntax in PHP which specifically marks a static call - Foo::bar() is used for both static calls, and for forwarding instance calls, most obviously in the case of parent::foo(). Secondly, until PHP 8, marking a method as static was optional; an error was only raised once you tried to access $this in a context where it wasn't defined. In PHP 4, this was correct code; in PHP 5 and 7, it raised diagnostics (first E_STRICT, later E_DEPRECATED) but still ran the method: class Foo { function bar() { echo 'Hello, World!'; } } Foo::bar(); I think that's part of the reason you're getting negative feedback: to you, the feature seems like an obvious extension, even a bug fix; but to others, it seems like a complete change to how static calls are interpreted. Regards, -- Rowan Tommins [IMSoP]
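A minimal sketch of the "inaccessible means it does not exist from this scope" interpretation, using only current behaviour. The Demo class is hypothetical.

```php
<?php
class Demo
{
    private function hidden(): string
    {
        return 'real hidden()';
    }

    public function __call(string $name, array $args): string
    {
        return "__call($name)";
    }

    public static function __callStatic(string $name, array $args): string
    {
        return "__callStatic($name)";
    }

    public function fromInside(): string
    {
        return $this->hidden(); // accessible in this scope, so no magic method is involved
    }
}

$demo = new Demo;
$results = [
    $demo->hidden(),     // private, so it "does not exist" from the calling scope
    $demo->fromInside(), // same method, reached from inside the class
    Demo::missing(),     // no such method at all, called in static context
];
var_dump($results);
```

The first two calls target the same method but give different results purely because of the scope they are made from.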
Re: [PHP-DEV] Consider removing autogenerated files from tarballs
On 31/03/2024 14:53, Christian Schneider wrote: But my main question is: I fail to see the difference whether I plant my malicious code in configure, configure.ac or *.c: Someone has to review the changes and notice the problem. And we have to trust the RMs. What am I missing? As I understand it, the attack being discussed involved *code that was never committed to version control*. The bulk of the payload was committed in fake binary test artifacts, which are unlikely to be inspected but harmless by themselves; but the trigger to incorporate it into the binary was added *manually* in between the automated build and producing the signed release archive. So the theory is that if there's no human involved in that process, there is no way for a human to introduce a malicious change at that step. An exploit would need to be introduced somewhere in version controlled, human-readable, code; giving extra chances for it to be detected. On 30/03/2024 18:24, Jakub Zelenka wrote: Do you think it would be different if the change happened in the distributed source file instead? I mean you could still modify tarball of the distributed file (e.g. hide somewhere in configure.ac or in our case more easily in less visible files like various Makefile.frag and similar). The only thing that you get by using just VCS files is that people could hash the distributed content of the files and compare it with the hash of the VCS files but does anyone do this sort of verification? We already use a version control system built entirely on comparing hashes of source files. So given a signed tarball that claimed to match the content of a signed tag, any user can trivially check out the tag, expand the tarball, and run "git diff" to detect any anomalies. The question of who would do that in practice is a valid one, and something that I'm sure has been discussed elsewhere regarding reproducible binary builds. 
On 30/03/2024 15:35, Daniil Gentili wrote: Btw, I do not believe that "it would require end users to install autotools and bison in order to compile PHP from tarballs" is valid reason to delay the patching of a serious attack vector ASAP. As is always the case, there is a trade-off between security and convenience - in this case, distributing something that's usable without large amounts of extra tooling (including, for some generated files, a copy of PHP itself), vs distributing something that is 100% reviewable by humans. Ultimately, 99.999% of users are not going to compile their own copy of PHP from source; they are going to trust some chain of providers to take the source, perform all the necessary build steps, and produce a binary. Removing generated files from the tarballs doesn't eliminate that need for trust, it just shifts more of it to organisations like Debian and RedHat; and maybe that's a valid aim, because those organisations have more resources than us to build appropriate processes. Making things reproducible aims to attack the same problem from a different angle: rather than placing more trust in one part of the chain, it allows multiple parallel chains, which should all give the same result. If builds from different sources start showing unexplained differences, it can be flagged automatically. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV][RFC] grapheme cluster for str_split, grapheme_str_split function
On 26/03/2024 21:14, Casper Langemeijer wrote:
> If you need someone to help for the grapheme_ marketing team, let me know.

I think a big part of the problem is that very few people dig into the complexities of text encoding, and so don't know that a "grapheme" is what they're looking for.

Unicode documentation is, generally, very careful with its terminology - distinguishing between "code points", "code units", "graphemes", "grapheme clusters", "glyphs", etc. Pretty much everyone else just says "character", and assumes that everyone knows what they mean.

As a case in point, looking at the PHP manual pages for strlen, mb_strlen, and grapheme_strlen:

Short summary:
- strlen — Get string length
- mb_strlen — Get string length
- grapheme_strlen — Get string length in grapheme units

Description:
- Returns the length of the given string.
- Gets the length of a string.
- Get string length in grapheme units (not bytes or characters)

The first two don't actually say what units they're measuring in. Maybe it's millimetres? ;)

The last one uses the term "grapheme" without explaining what it means, and makes a contrast with "characters", which is confusing, as one of the definitions in the Unicode glossary [https://unicode.org/glossary/#grapheme] is:

> What a user thinks of as a character.

The mb_strlen documentation has a bit more explanation in its Return Values section:

> Returns the number of characters in string string having character encoding encoding. A multi-byte character is counted as 1.

For Unicode in particular, this is a poor description; it is completely missing the term "code point", which is what it actually counts. That's probably because ext/mbstring wasn't written with Unicode in mind, it was "developed to handle Japanese characters", back in 2001; and it still does support several pre-Unicode "multi-byte encodings". For a bit of nostalgia: http://web.archive.org/web/20010605075550/http://www.php.net/manual/en/ref.mbstring.php

So... if you want to help make people more aware of the grapheme_* functions, one place to start would be editing the documentation for the various string, mbstring, and grapheme functions to use consistent terminology, and sign-post each other more clearly. http://doc.php.net/tutorial/

Regards,
--
Rowan Tommins [IMSoP]
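To make the three units concrete, here is a small sketch that measures the same string with all three functions. The sample string is my own choice; mbstring and intl are optional extensions, so the calls are guarded.

```php
<?php
// "e" followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived character,
// two code points, three bytes in UTF-8.
$text = "e\u{301}";

// strlen() counts bytes.
$bytes = strlen($text);

// mb_strlen() counts code points; grapheme_strlen() counts grapheme clusters.
// Both come from optional extensions, so fall back to null when unavailable.
$codePoints = function_exists('mb_strlen') ? mb_strlen($text, 'UTF-8') : null;
$graphemes  = function_exists('grapheme_strlen') ? grapheme_strlen($text) : null;

printf(
    "bytes=%d code points=%s grapheme clusters=%s\n",
    $bytes,
    $codePoints ?? 'n/a',
    $graphemes ?? 'n/a'
);
```

With both extensions loaded, the three functions give three different answers for a string most users would describe as a single character.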
Re: [PHP-DEV] Proposal: Make $offset of substr_replace null by default
[Aside: please don't use "reply" when starting a new thread. Although GMail and its imitators frequently ignore it, a reply contains a header telling clients where to add it to an existing thread. I've pasted your full text into a new e-mail rather than replying, so it reliably shows as its own thread.]

On 23/03/2024 22:58, mickmackusa wrote:
> substr_replace() has the following signature:
>
> substr_replace(
>     array|string $string,
>     array|string $replace,
>     array|int $offset,
>     array|int|null $length = null
> ): string|array
>
> Was it deliberate to not allow a null value as the third parameter? If permitted to amend this signature, I think it would be sensible to set null as the default value for $offset and adopt the same logic as the $length parameter.
>
> I have recently stumbled upon what I assume is a code smell in multiple SO scripts that use:
>
> $prefixed = preg_filter('/^/', 'prefix_', $array);
>
> It smells because, regardless of the types of the values in the passed-in array, each value is coerced to a string and so always has a starting position. In other words, the destructive feature of preg_filter() is never utilized.
>
> This means that for this task, preg_filter() can be unconditionally replaced with preg_replace().
>
> $prefixed = preg_replace('/^/', 'prefix_', $array);
>
> But wait, regex isn't even needed for this task. It can be coded more efficiently as:
>
> $prefixed = substr_replace($array, 'prefix_', 0, 0);
>
> Next, my mind shifted to suffixing/postfixing, by using $ in the pattern.
>
> $suffixed = preg_replace('/$/', '_suffix', $array);
>
> However, there isn't a convenient way to append a string to each value using substr_replace() with the current signature.
>
> If the $offset parameter worked like the $length parameter, then the language would provide a native, non-regex tool for appending a static string to all array elements.
> > $suffixed = substr_replace($array, '_suffix'); > > Finally, I wish to flag the observation that null values inside of an array are happily coerced to strings inside of the aforementioned functions, but null is not consumable if singularly passed in. > > Some examples for context: https://3v4l.org/ENVip > > I look forward to hearing feedback/concerns. Not being familiar with the variations supported by substr_replace, it took me a while to understand what was being proposed here. In case anyone else is similarly lost, a null $length is equivalent to strlen($string), meaning "replace to the end"; so a null $offset having the same meaning would give "append to the end". On its own, this would be pretty pointless: $foo = substr_replace('abc', 'xyz', null); // a long-winded way of writing $foo = 'abc' . 'xyz'; But the function also has built-in mapping over arrays, so it could be used to append the same string to multiple inputs: $foo = substr_replace(['hello', 'goodbye'], '!', null); Or append each entry from one list onto each entry in the other: $foo = substr_replace(['one', 'two'], [' - uno', ' - dos'], null); Demo: https://3v4l.org/6eEIG While I can see the logic, it would never occur to me to use any of the functions mentioned for this task, rather than using array_map and a regular concatenation: $foo = array_map(fn($string) => $string . '!', ['hello', 'goodbye']); $foo = array_map(fn($string, $suffix) => $string . $suffix, ['one', 'two'], [' - uno', ' - dos']); Which of course extends to more complex cases: $foo = array_map(fn($english, $spanish) => "'$english' en Español es '$spanish'", ['one', 'two'], ['uno', 'dos']); $foo = array_map(fn($english, $spanish, $german) => "$english - $spanish - $german", ['one', 'two'], ['uno', 'dos'], ['ein', 'zwei']); https://3v4l.org/d55kT So, I'm not opposed to the change, but its value seems marginal. Regards, -- Rowan Tommins [IMSoP]
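For comparison, a sketch of how appending can be spelled with the existing signature, by giving each element its own offset. The variable names are illustrative, and this is only one possible workaround, not something proposed in the thread.

```php
<?php
$words = ['hello', 'goodbye'];

// Prefix: insert at offset 0, replacing 0 characters.
$prefixed = substr_replace($words, 'prefix_', 0, 0);

// Suffix with the current signature: pass one offset per element
// (that element's length) and replace 0 characters.
$suffixed = substr_replace($words, '_suffix', array_map('strlen', $words), 0);

// The array_map spelling from the reply above, for comparison.
$mapped = array_map(fn($word) => $word . '_suffix', $words);
```

The per-element offsets work, but they are noticeably clumsier than either the proposed null offset or plain array_map, which supports the "marginal value" conclusion.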
Re: [PHP-DEV] Proposal: AS assertions
On Fri, 22 Mar 2024, at 17:38, Claude Pache wrote:
>> On 22 March 2024 at 16:18, Rowan Tommins [IMSoP] wrote:
>>
>> $optionalExpiryDateTime = $expiry as ?DateTimeInterface else some_other_function($expiry);
>> assert($optionalExpiryDateTime is ?DateTimeInterface); // cannot fail, already asserted by the "as"
>
> I think that the `is` operator is all we need; the `as` operator adds syntax complexity for little gain. Compare:
>
> $optionalExpiryDateTime = $expiry as ?DateTimeInterface else some_other_function($expiry);
>
> vs
>
> $optionalExpiryDateTime = $expiry is ?DateTimeInterface ? $expiry : some_other_function($expiry);

I agree, it doesn't add much; and that's what the draft RFC Ilija linked to says as well.

But the point of that particular example is that after the "is" version, you don't actually know the type of $optionalExpiryDateTime without looking up the return type of some_other_function().

With the "as" version, you can see at a glance that after that line, $optionalExpiryDateTime is *guaranteed* to be DateTimeInterface or null, which I understood to be the intention of Robert's original proposal on this thread.

--
Rowan Tommins [IMSoP]
Re: [PHP-DEV] Proposal: AS assertions
On Fri, 22 Mar 2024, at 12:58, Robert Landers wrote:
>> $optionalExpiryDateTime = $expiry as ?DateTimeInterface else new DateTimeImmutable($expiry);
> I'm not sure I can grok what this does...
>
> $optionalExpiryDateTime = ($expiry === null || $expiry instanceof DateTimeInterface) ? $expiry : new DateTimeImmutable($expiry)

Trying to write it as a one-liner is going to make for ugly code - that's why I'd love to have a new way to write it! But yes, that's the right logic.

With the "is" operator from the Pattern Matching draft, it would be:

    $optionalExpiryDateTime = ($expiry is ?DateTimeInterface) ? $expiry : new DateTimeImmutable($expiry);

But with a clearer assertion that the variable will end up with the right type in all cases:

    $optionalExpiryDateTime = $expiry as ?DateTimeInterface else some_other_function($expiry);
    assert($optionalExpiryDateTime is ?DateTimeInterface); // cannot fail, already asserted by the "as"

> Maybe? What would be the usefulness of this in real life code? I've never written anything like it in my life.

I already explained the scenario: the parameter is optional, so you want to preserve nulls; but if it *is* present, you want to make sure it's the correct type before proceeding. Another example:

    // some library function that only supports strings and nulls
    function bar(?string $name) {
        if ( $name !== null ) ... else ...
    }

    // a function you're writing that supports various alternative formats
    function foo(string|Stringable|int|null $name = null) {
        // we don't want to do anything special with nulls here, just pass them along
        // but we do want to convert other types to string, so that bar() doesn't reject them
        bar($name as ?string else (string)$name);
    }

To put it another way, it's no different from any other union type: at some point, you will probably want to handle the different types separately, but at this point in the program, either type is fine.
In this case, the types that are fine are DateTimeInterface and null; or in the example above, string and null. > $optionalExpiryDateTime = $expiry == null ? $expiry : $expiry as > DateTimeInterface ?? new DateTimeImmutable($expiry as string ?? "now") If you think that's "readable" then we might as well end the conversation here. If that was presented to me in a code review, I'd probably just write "WTF?!" I have no idea looking at that what type I can assume for $optionalExpiryDateTime after that line, which was surely the whole point of using "as" in the first place? Regards, -- Rowan Tommins [IMSoP]
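The foo()/bar() scenario above can be written in today's PHP, at the cost of spelling out the check by hand. This is only a sketch: describe() and greet() are hypothetical stand-ins for bar() and foo(), and the return strings are invented.

```php
<?php
// Hypothetical library function that only accepts strings and nulls.
function describe(?string $name): string
{
    return $name === null ? 'anonymous' : "name: $name";
}

// Today's spelling of the proposed `$name as ?string else (string)$name`.
function greet(string|Stringable|int|null $name = null): string
{
    // Nulls and strings pass through untouched; everything else is converted.
    $normalised = ($name === null || is_string($name)) ? $name : (string)$name;

    return describe($normalised);
}

// A Stringable value for demonstration purposes.
$stringable = new class {
    public function __toString(): string
    {
        return 'Grace';
    }
};

$results = [greet(), greet('Ada'), greet(42), greet($stringable)];
```

After the ternary, a reader still has to work out that $normalised is ?string; the proposed "as" form would state that guarantee directly.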
Re: [PHP-DEV] Proposal: AS assertions
On Fri, 22 Mar 2024, at 10:05, Robert Landers wrote:
> After asking an AI for some examples and usages, the most compatible one would be C#'s. In actuality, I think it could be hugely simplified if we simply return null instead of throwing. There'd be no special case for |null, and it would move the decision making to the programmer:
>
> $x = $a as int ?? throw new LogicException();

It might be relevant that C# has only recently introduced the concept of explicitly nullable reference types, with a complex migration process for existing code: https://learn.microsoft.com/en-us/dotnet/csharp/nullable-migration-strategies

So in most C# code, there isn't actually a difference between "expect a DateTime" and "expect a DateTime or null".

PHP, however, strictly separates those two, and always has; so this would be surprising:

    $x = $a as DateTime;
    assert($x instanceof DateTime); // will fail if $x has defaulted to null!

That's why I suggested that with an explicit default, the default would be automatically asserted as matching the specified type:

    $x = $a as DateTime else 'No date given'; // TypeError: string given, DateTime expected
    $x = $a as DateTime|string else 'No date given'; // OK
    $x = $a as DateTime else null; // TypeError: null given, DateTime expected
    $x = $a as ?DateTime else null; // OK

If the statement runs without error, $x is guaranteed to be of the type (or pattern) given to the "as" operator.

Regards,
--
Rowan Tommins [IMSoP]
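The "fallback must also match the type" rule can be approximated in userland. The helper below is a hypothetical stand-in, not the proposed operator: as_or_else() and the predicate closures are invented names, and unlike a real operator the fallback is evaluated eagerly because it is an ordinary argument.

```php
<?php
/**
 * Hypothetical userland stand-in for `$value as T else $fallback`.
 * $matches is a predicate standing in for the type T.
 */
function as_or_else(mixed $value, callable $matches, mixed $fallback): mixed
{
    if ($matches($value)) {
        return $value;
    }
    // The fallback is held to the same constraint as the value itself.
    if (!$matches($fallback)) {
        throw new TypeError(
            'fallback of type ' . get_debug_type($fallback) . ' does not match the asserted type'
        );
    }
    return $fallback;
}

$isDateTime         = fn(mixed $v): bool => $v instanceof DateTimeInterface;
$isNullableDateTime = fn(mixed $v): bool => $v === null || $v instanceof DateTimeInterface;

// null is an acceptable fallback for "?DateTimeInterface".
$ok = as_or_else('not a date', $isNullableDateTime, null);

// null is not an acceptable fallback for plain "DateTimeInterface".
try {
    as_or_else('not a date', $isDateTime, null);
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}
```

Whichever branch is taken, a value that comes back from as_or_else() satisfies the predicate, which mirrors the guarantee described in the message.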
Re: [PHP-DEV] Proposal: AS assertions
On Fri, 22 Mar 2024, at 08:17, Jordi Boggiano wrote:
> We perhaps could make sure that as does not throw if used with `??`, or that `??` catches the type error and returns the right-hand expression instead:
> So to do a nullable typecast you would do:
>
> $a as int|float ?? null

While this limits the impact to only expressions combining as with ??, it still has the same fundamental problem: you can't meaningfully use it with a nullable type.

As a concrete example, imagine you have an optional $description parameter, and want to ensure any non-null values are converted to string, but keep null unchanged.

At first sight, it looks like you could write this:

    $descString = $description as string|null ?? (string)$description;

But this won't work - the ?? swallows the null and turns it into an empty string, which isn't what you wanted. You need some syntax that catches the TypeError, but preserves the null:

    $descString = $description as string|null else (string)$description;
    // or
    $descString = $description as string|null catch (string)$description;
    // or
    $descString = $description as string|null default (string)$description;

I actually think there are quite a lot of scenarios where that idiom would be useful:

    $optionalExpiryDateTime = $expiry as ?DateTimeInterface else new DateTimeImmutable($expiry);
    $optionalUnixTimestamp = $time as ?int else strtotime((string)$time);
    $optionalUnicodeName = $name as ?UnicodeString else new UnicodeString( $name );
    etc

And once you have that, you don't need anything special for the null case, it's just:

    $nameString = $name as ?string else null;

Regards,
--
Rowan Tommins [IMSoP]
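The way ?? swallows a legitimate null can be reproduced with existing syntax. In this sketch, string_or_null() is a hypothetical stand-in for an "as" that yields null on failure; viaCoalesce() and viaElse() are likewise invented names.

```php
<?php
// Stand-in for an `as string|null` that returns null when the check fails.
function string_or_null(mixed $value): ?string
{
    return is_string($value) ? $value : null;
}

// `??` cannot tell "the input was null" apart from "the check failed".
function viaCoalesce(mixed $description): ?string
{
    return string_or_null($description) ?? (string)$description;
}

// An explicit branch (what the proposed `else` would express) preserves null.
function viaElse(mixed $description): ?string
{
    return ($description === null || is_string($description))
        ? $description
        : (string)$description;
}

$coalesced = [viaCoalesce('text'), viaCoalesce(42), viaCoalesce(null)];
$branched  = [viaElse('text'), viaElse(42), viaElse(null)];
```

The two versions agree for strings and integers and differ only for null, where the ?? version produces an empty string.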
Re: [PHP-DEV] Proposal: AS assertions
On 22 March 2024 00:04:27 GMT, Robert Landers wrote: >I think that is where we are getting confused: `null` is a value (or >at least, the absence of a value). The fact that the type system >allows it to be used as though its a type (along with true and false) >is interesting, but I think it is confusing the conversation. Every value needs to belong to some type: for instance, true and false belong to the type "boolean", as returned by the gettype() function. There is a value called null, and the type it belongs to is also called "null". Unlike some languages, PHP has no concept of a typed null reference - you can't have "a null DateTime"; you can only have the one universal null, of type null. The existence of "null" in type checks is therefore necessary if you want to allow every value to pass some type check. There isn't any other type that can include the value null because the type of null is always null. That's completely different from true and false, both of which are covered by a type check for "bool". They are special cases, which aren't consistent with anything else in the type system. The "false" check was added first, as a way to express clearly the common pattern in old standard library functions of returning false on error. Then "true" was added later, for consistency. Both are newer, and far more exotic, than "null". Disallowing true and false in some type checking contexts would be fine (although mostly they're pointless, rather than harmful). Disallowing or repurposing null would mean you have an incomplete type system, because there is no other type to match a null value against. Regards, Rowan Tommins [IMSoP]
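A quick check of the claims above about which type each value belongs to, using only built-in functions. The variable names are my own.

```php
<?php
// gettype() uses the legacy names; get_debug_type() uses the names seen in declarations.
$legacyNames = [gettype(null), gettype(true), gettype(false)];
$modernNames = [get_debug_type(null), get_debug_type(true), get_debug_type(false)];

// A bool check covers both true and false; only null itself passes a null check.
$checks = [
    'true is bool'  => is_bool(true),
    'false is bool' => is_bool(false),
    'null is null'  => is_null(null),
    'false is null' => is_null(false),
];
```

true and false share a type, while null is the only inhabitant of its own, which is why a type system without "null" could not describe every value.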
Re: [PHP-DEV] Proposal: AS assertions
On 21/03/2024 19:03, Robert Landers wrote: I suppose we are taking this from different viewpoints, yours appears to be more of a philosophical one, whereas mine is more of a practical one. My main concern is consistency; which is partly philosophical, but does have practical impact - the same syntax meaning the same thing in different contexts leads to less user confusion and fewer bugs. But I also think there are real use cases for "error on anything other than either Foo or null" separate from "give me a null for anything other than Foo". $x = $a as null; (or any other value, such as true|false) appears to have no practical purpose in this particular case. There's plenty of possible pieces of code that have no practical purpose, but that on its own isn't a good reason to make them do something different. "null" as a standalone type (rather than part of a union) is pretty much always pointless, and was forbidden until PHP 8.2. It's now allowed, partly because there are scenarios involving inheritance where it does actually make sense (e.g. narrowing a return type from Foo|null to null); and probably also because it's easier to allow it than forbid it. That's not really what we're talking about anyway, though; we're talking about nullable types, or null in a union type, which are much more frequently used. Further, reading "$x = $a as null", as a native English speaker, appears to be the same as "$x = null". Well, that's a potential problem with the choice of syntax: "$x = $a as int" could easily be mistaken for "cast $a as int", rather than "assert that $a is int". If you spell out "assert that $a is null", or "assert that $a is int|null", it becomes very surprising for 'hello' to do anything other than fail the assertion. 
As I mentioned in the beginning, I see this mostly being used when dealing with mixed types from built-in/library functions, where you have no idea what the actual type is, but when you write the code, you have a reasonable expectation of a set of types and you want to throw if it is unexpected. My argument is that you might have a set of expected types which includes null, *and* want to throw for other, unexpected, values. If "|null" is special-cased to mean "default to null", there's no way to do that. Right now, the best way to do that is to simply set a function signature and pass the mixed type to the function to have the engine do it for you And if you do that, then a value of 'hello' passed to a parameter of type int|null, will throw a TypeError, not give you a null. As I illustrated in my last e-mail, you can even (since PHP 8.2) have a parameter of type null, and get a TypeError for any other value. That may not be useful, but it's entirely logical. It makes more sense, from a practical programming point-of-view, to simply return the value given if none of the types match. This perhaps is a key part of our difference: when I see "int|bool|null", I don't see any "value given", just three built-in types: int, which has a range of values from PHP_INT_MIN to PHP_INT_MAX; bool, which has two possible values "true" and "false"; and null, which has a single possible value "null". So there are 2**64 + 2 + 1 possible values that meet the constraint, and nothing to specify that one of those is my preferred default if given something unexpected. Regards, -- Rowan Tommins [IMSoP]
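The behaviour of a parameter typed int|null, as described above, can be checked directly. takesIntOrNull() is a hypothetical function name; the snippet assumes the default (non-strict) typing mode.

```php
<?php
function takesIntOrNull(int|null $x): string
{
    return get_debug_type($x);
}

$results = [];
foreach ([42, null, 'hello'] as $input) {
    try {
        $results[] = takesIntOrNull($input);
    } catch (TypeError $e) {
        // 'hello' is neither an int nor null, so the engine throws
        // instead of silently substituting null.
        $results[] = 'TypeError';
    }
}
```

null is accepted because it is one of the listed types, not because it acts as a default for unexpected input.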
Re: [PHP-DEV] Proposal: AS assertions
On 21/03/2024 15:02, Robert Landers wrote: I don't think you are getting what I am saying. $a as int|float would be an int, float, or thrown exception. $a as int|float|null would be an int, float, or null. I get what you're saying, but I disagree that it's a good idea. If $a is 'hello', both of those statements should throw exactly the same error, for exactly the same reason - the input is not compatible with the type you have specified. Another way of thinking about it is: $x = $a as null What do you expect $x to be? The same as $x inside this function: function foo(null $x) { var_dump($x); } foo($a); Which is null if $a is null, and a TypeError if $a is anything else: https://3v4l.org/5UR5A Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] Proposal: AS assertions
On 20/03/2024 23:05, Robert Landers wrote: > In other > words, I can't think of a case where you'd actually want a Type|null > and you wouldn't have to check for null anyway. It's not about having to check for null; it's about being able to distinguish between "a null value, which was one of the expected types" and "a value of an unexpected type". That's a distinction which is made everywhere else in the language: parameter types, return types, property types, will all throw an error if you pass a Foo when a ?Bar was expected, they won't silently coerce it to null. > If you think about it, in this proposal, you could use it in a match: > > // $a is TypeA|TypeB|null > > match (true) { > $a as ?TypeA => 'a', > $a as ?TypeB => 'b', > $a === null => 'null', > } That won't work, because match performs a strict comparison, and the as expression won't return a boolean true. You would have to do this: match (true) { (bool)($a as ?TypeA) => 'a', (bool)($a as ?TypeB) => 'b', $a === null => 'null', } Or this: match (true) { ($a as ?TypeA) !== null => 'a', ($a as ?TypeB) !== null => 'b', $a === null => 'null', } Neither of which is particularly readable. What you're really looking for in that case is an "is" operator: match (true) { $a is TypeA => 'a', $a is TypeB => 'b', $a === null => 'null', } Which in the draft pattern matching RFC Ilija linked to can be abbreviated to: match ($a) is { TypeA => 'a', TypeB => 'b', null => 'null', } Of course, in simple cases, you can use "instanceof" in place of "is" already: match (true) { $a instanceof TypeA => 'a', $a instanceof TypeB => 'b', $a === null => 'null', } > Including `null` in that type > seems to be that you would get null if no other type matches, since > any variable can be `null`. > I can't think of any sense in which "any variable can be null" that is not true of any other type you might put in the union. 
We could interpret Foo|false as meaning "use false as the fallback"; or Foo|int as "use zero as the fallback"; but I don't think that would be sensible.

In other words, the "or null on failure" part is an option to the "as" expression, it's not part of the type you're checking against. If we only wanted to support "null on failure", we could have a different keyword, like "?as":

    $bar = new Bar;
    $bar as ?Foo; // Error
    $bar ?as Foo; // null (as fallback)

    $null = null;
    $null as ?Foo; // null (because it's an accepted value)
    $null ?as Foo; // null (as fallback)

A similar suggestion was made in a previous discussion around nullable casts - to distinguish between (?int)$foo as "cast to nullable int" and (int?)$foo as "cast to int, with null on error".

Note however that combining ?as with ?? is not enough to support "chosen value on failure":

    $bar = new Bar;
    $bar ?as ?Foo ?? Foo::createDefault(); // creates default object

    $null = null;
    $null ?as ?Foo ?? Foo::createDefault(); // also creates default object, even though null is an expected value

That's why my earlier suggestion was to specify the fallback explicitly:

    $bar = new Bar;
    $bar as ?Foo else null; // null
    $bar as ?Foo else Foo::createDefault(); // default object

    $null = null;
    $null as ?Foo else null; // null
    $null as ?Foo else Foo::createDefault(); // also null, because it's an accepted value, so the fallback is not evaluated

Probably, it should then be an error if the fallback value doesn't meet the constraint:

    $bar = new Bar;
    $bar as Foo else null; // error: fallback value null is not of type Foo
    $bar as ?Foo else 42; // error: fallback value 42 is not of type ?Foo

Regards,
--
Rowan Tommins [IMSoP]
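The point about match using strict comparison can be demonstrated with syntax that already exists. TypeA, TypeB, classify() and objectArm() are invented for this sketch.

```php
<?php
// Hypothetical marker classes for the example.
class TypeA {}
class TypeB {}

// The instanceof form from the message: every arm evaluates to a real boolean.
function classify(TypeA|TypeB|null $a): string
{
    return match (true) {
        $a instanceof TypeA => 'a',
        $a instanceof TypeB => 'b',
        $a === null => 'null',
    };
}

// match compares with ===, so an arm that evaluates to an object never
// equals true, even though the object would be truthy in an if statement.
function objectArm(object $o): string
{
    try {
        return match (true) {
            $o => 'matched',
        };
    } catch (\UnhandledMatchError $e) {
        return 'unhandled';
    }
}

$classified = [classify(new TypeA), classify(new TypeB), classify(null)];
$objectCase = objectArm(new TypeA);
```

This is why an "as" expression that returns the value itself cannot be used as a match arm without an explicit cast or comparison.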
Re: [PHP-DEV] Proposal: AS assertions
On 20 March 2024 12:51:15 GMT, Robert Landers wrote: >Oh and there isn't any difference between: > >$x as ?Type > >or > >$x as Type|null I'm not sure if I've misunderstood your example, or you've misunderstood mine. I'm saying that this should be an error, because the value is neither an instance of Foo nor null: $a = 42; $b = $a as Foo|null; Your earlier example implies that would make $b equal null, which feels wrong to me, because it means it wouldn't match this: $a = 42; $b = $a as Foo|Bar; If we want a short-hand for "set to null on error" that should be separate from the syntax for a nullable type. Regards, Rowan Tommins [IMSoP]
Re: [PHP-DEV] Proposal: AS assertions
On 19/03/2024 16:24, Robert Landers wrote:
> $x = $attributeReflection->newInstance() as ?MyAttribute;
> if ($x === null) // do something since the attribute isn't MyAttribute

I think reusing nullability for this would be a mistake - ideally, the right-hand side should allow any type, so "$foo as ?Foo" should mean the same as "$foo as Foo|null".

A better alternative might be to specify a default when the type didn't match:

    $x = $attributeReflection->newInstance() as ?MyAttribute else null;
    if ($x === null) // do something since the attribute isn't MyAttribute

Which then also allows you to skip the if statement completely:

    $x = $attributeReflection->newInstance() as MyAttribute else MyAttribute::createDefault();

That then looks a lot like a limited-use version of syntax for catching an exception inline, which would be nice as a general feature (but I think maybe hard to implement?)

    $x = somethingThatThrows() catch $someDefaultValue;

As well as pattern matching, which Ilija mentioned, another adjacent feature is a richer set of casting operators. Currently, we can assert that something is an int; or we can force it to be an int; but we can't easily say "make this an int if safe, but throw otherwise" or "make this an int if safe, but substitute null/$someValue otherwise".

I've been considering how we can improve that for a while, but not settled on a firm proposal - there's a lot of different versions we *could* support, so choosing a minimal set is hard.

Regards,
--
Rowan Tommins [IMSoP]
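The two missing casts mentioned above can be roughly approximated today with the filter extension. This is a sketch under an assumption: it takes FILTER_VALIDATE_INT as the definition of "safe", which is only one of several possible definitions. int_or_throw() and int_or_default() are hypothetical helper names.

```php
<?php
// "Make this an int if safe, but throw otherwise."
function int_or_throw(mixed $value): int
{
    $int = filter_var($value, FILTER_VALIDATE_INT, FILTER_NULL_ON_FAILURE);
    if ($int === null) {
        throw new InvalidArgumentException('not a safe integer: ' . get_debug_type($value));
    }
    return $int;
}

// "Make this an int if safe, but substitute null/$default otherwise."
function int_or_default(mixed $value, ?int $default = null): ?int
{
    return filter_var($value, FILTER_VALIDATE_INT, FILTER_NULL_ON_FAILURE) ?? $default;
}

$safe        = int_or_throw('42');
$nullOnError = int_or_default('hello');
$zeroOnError = int_or_default('hello', 0);

try {
    int_or_throw('hello');
    $threw = false;
} catch (InvalidArgumentException $e) {
    $threw = true;
}
```

Even this small sketch has to pick one rule for what counts as safe input, which hints at why choosing a minimal set of built-in variants is hard.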
Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type
On 18/03/2024 04:39, Alexander Pravdin wrote: I'm not in the context of the core team plans regarding "strict types". Could you share some details here? What is the current plan regarding it? To make strict types on by default eventually? Or something else? PHP doesn't really have a defined "core team". There are contributors who are particularly active at a given time (sometimes, but far from always, because someone is paying them), contributors who are particularly long-standing and respected, contributors who throw themselves into a pet project and make it happen, and so on. Partly as a consequence of this, it's often hard to pin down any long-term plan about anything, outside of what particular people would like to see. So Gina's opinion (it was suffixed "IMHO") that strict types was a mistake shouldn't be read as "we have a solid plan for what is going to replace strict_types which everyone is on board with". I think a reasonable number of people do share the sentiment that having two separate modes was a mistake; and neither mode is actually perfect. It's not about "making it on by default", it's about coming up with a unified behaviour that makes the setting redundant. All of which is something of a diversion from the topic at hand, which is this: How can we introduce the ability to write user code in default decimals and at the same time keep the old way of working as it was before, to not introduce any troubles into the existing code and not introduce performance issues? As a user, I would like to have a choice. I don't think choice is really what you want: if you were designing a language from scratch, I doubt you would say "let's give the user a choice of what type 1 / 10 returns". What it's actually about is *backwards compatibility*: what will happen to code that expects 1/10 to give a float, if it suddenly starts giving a decimal. For most cases, I think the rule can be as simple as "decimal in means decimal out". 
What's maybe not as obvious at first sight is that that can apply to operators as well as functions, and already does: 100 / 10 gives int(10), but 100.0 / 10 gives float(10.0), as do 100 / 10.0 and 100.0 / 10.0.

By the same logic, decimal(1) / 10 can produce decimal(0.1) instead of float(0.1), and we don't need any fancy directives. Even better if we can introduce a shorter syntax for decimal literals, so that it becomes 1_d / 10.

Where things get more complicated is with *fixed-precision* decimals, which is what is generally wanted for something like money. What is the correct result of decimal(1.03, precision: 2) / 2? Is it decimal(0.515, 3)? decimal(0.51, 2)? decimal(0.52, 2)? An error? And what about decimal(10) / 3?

If you stick to functions / methods, this is slightly less of an issue, because you can have

    decimal(1.03, 2)->dividedBy(2, RoundingMode::DOWN) == decimal(0.51, 2)

or

    decimal(1.03, 2)->split(2) == [ decimal(0.52, 2), decimal(0.51, 2) ]

Example names taken directly from the brick/money package.

At that point, backwards compatibility is less of an issue as well: make the new functions convenient to use, but distinct from the existing ones.

In short, the best way of avoiding declare() directives is not to replace them with something else, but to choose a design where nobody feels the need for them.

Regards,
--
Rowan Tommins [IMSoP]
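Both halves of the argument above can be tried with existing types. The first part shows the int/float "type in, type out" behaviour of the division operator; the second is a hypothetical split_cents() helper that keeps money as integer minor units, loosely modelled on the split() idea and assuming a non-negative amount and a positive number of parts.

```php
<?php
// Existing operators already follow "type in, type out" for int and float.
$intResult   = 100 / 10;    // int operands that divide exactly
$floatResult = 100.0 / 10;  // one float operand
$fraction    = 1 / 10;      // int operands that do not divide exactly

// Hypothetical helper: split an amount in cents into $parts shares that
// differ by at most one cent and always add up to the original amount.
// Assumes $amount >= 0 and $parts >= 1.
function split_cents(int $amount, int $parts): array
{
    $base = intdiv($amount, $parts);
    $remainder = $amount % $parts;

    $shares = [];
    for ($i = 0; $i < $parts; $i++) {
        // The first $remainder shares receive one extra cent.
        $shares[] = $base + ($i < $remainder ? 1 : 0);
    }
    return $shares;
}

$halves = split_cents(103, 2);
$thirds = split_cents(1000, 3);
```

Because the rounding rule lives in a named function, no operator has to guess what the caller wants, which is the advantage of the method-based design described in the message.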
Re: [PHP-DEV] [RFC] Property accessor hooks, take 2
On 18/03/2024 00:04, Ilija Tovilo wrote: I realize this is somewhat inconsistent, but I believe it is reasonable. If you want to expose the underlying property by-reference, you need to jump through some additional hoops. I disagree with this reasoning, because I foresee plenty of cases where a virtual property is necessary anyway, so doesn't provide any additional hoop to jump through. But there's not much more to say on this point, so I guess we'll leave it there. Again, it depends on how you think about it. As you have argued, for a get-only property, the backing value should not be writable without an explicit `set;` declaration. You can interpret `set;` as an auto-generated hook, or as a marker that indicates that the backing value is accessible without a hook. Regardless of which of these views you start with, it still seems intuitive to me that accesses inside the get hook would bypass the normal rules and write to the raw value. Leaving aside the implementation, there are three things that can happen when you write to a property: a) the set hook is called b) the raw property is written to c) an error is thrown Inside the dynamic scope of a hook, the behaviour is always (b), and I don't see any reason for that to change. From anywhere else, backed properties currently try (a) and fall back to (b); virtual properties try (a) and fall back to (c). I do understand that falling back to (b) makes the implementation simpler, and works well with inheritance and some use cases; but falling back to (c) wouldn't necessarily need a "default hook", just a marker of "has hooks". It occurred to me you could implement it in reverse: auto-generate a hook "set => throw new Error;" and then *remove* it if the user opts in to the default set behaviour. That would keep the "write directly" case optimised "for free"; but it would be awkward for inheritance, as you'd have to somehow avoid calling the parent's hook. The meaning for `set;` is no longer clear. 
Does it mean that there's a generated hook that accesses the backing field? Does it mean that the backing field is accessible without a hook? Or does it mean that it accesses the parent hook? The truth is, with inheritance there's no way to look at the property declaration and fully understand what's going on, unless all hooks must be spelled out for the sake of clarity (e.g. `get => parent::$prop::get()`). Yes, I think this is probably a good argument against requiring "set;" I think "be careful when inheriting only one hook" will always be a key rule to teach anyway, because it's easy to mess up (e.g. assuming the parent is backed and accessing $this->foo, rather than calling the parent's hook implementation). But adding "set;" into the mix probably just makes it worse. I seriously doubt accessing the backing value outside of the current hook is useful. The backing value is an implementation detail. If it is absolutely needed, `ReflectionProperty::setRawValue()` offers a way to do it. I understand the desire for a shorter alternative like `$field`, but it doesn't seem like the majority shares this desire at this point in time. The example of clearAll() is a real use case, which people will currently achieve with __get and __set (e.g. the Yii ActiveRecord implementation I linked in one of my previous messages). The alternative wouldn't be reflection, it would just be switching to a virtual property with the value stored in a private field. I think that's fine, it's just drawing the line of which use cases backed properties cover: Kotlin covers more use cases than C#; PHP will cover more than Kotlin (methods able to by-pass a hook when called from that hook); but it will draw the line here. A different syntax like `$this->prop::raw` comes with similar complexity issues, similar to those previously discussed for `parent::$prop`/`parent::$prop = 'prop'`. Yeah, I can't even think of a nice syntax for it, let alone a nice implementation. 
Let's leave it as a thought experiment, no further action needed. :) Regarding asymmetric types: I can't speak for IDEs or static analyzers, but I'm not sure what makes this case special. We can ask some of their maintainers for feedback. In order to reliably tell the user whether "$a->foo = $b->bar;" is a type-safe operation, the analyser will need to track two types for every property, the "gettable type" and the "settable type", and apply them in the correct contexts. I've honestly no idea whether that will be easy or hard; it will probably vary between tools. In particular, I get the impression IDEs / editor plugins sometimes have a base implementation used for multiple programming languages, and PHP might be the only one that needed this extra tracking. Regards, -- Rowan Tommins [IMSoP]
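The three write outcomes listed in the message above, (a) the set hook is called, (b) the raw property is written to, (c) an error is thrown, can be sketched as follows. This is a minimal illustration, assuming the hook syntax and semantics as later released in PHP 8.4; the class and property names are hypothetical and do not come from the thread.

```php
<?php
// Requires PHP 8.4+ (property hooks). All names here are illustrative only.
final class Outcomes
{
    // (a) A set hook exists, so outside writes go through it.
    public string $hooked = '' {
        set { $this->hooked = strtoupper($value); }
    }

    // (b) Backed property with only a get hook: outside writes
    //     fall back to writing the raw backing value.
    public int $backed = 0 {
        get => $this->backed + 1;
    }

    // (c) Virtual property with only a get hook: there is nothing
    //     to write to, so outside writes throw an Error.
    public int $virtual {
        get => 42;
    }
}

$o = new Outcomes();

$o->hooked = 'abc';   // (a) stored via the set hook
$o->backed = 5;       // (b) stored directly; the get hook still applies on read

$virtualError = null;
try {
    $o->virtual = 1;  // (c) no set hook and no backing value
} catch (Error $e) {
    $virtualError = $e->getMessage();
}

var_dump($o->hooked, $o->backed, $virtualError !== null);
```

Inside a hook, `$this->hooked` and `$this->backed` refer to the raw value, which is why the hooks above do not recurse.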
Re: [PHP-DEV] [Pre-RFC] Improve language coherence for the behaviour of offsets and containers
On 11/03/2024 12:52, Gina P. Banyard wrote: I would like to get some initial feedback on an RFC I've been working on for the last 5–6 months. The RFC attempts to explain, and most importantly, improve the semantics around $container[$offset] as PHP is currently widely inconsistent. [...] RFC: https://github.com/Girgias/php-rfcs/blob/master/container-offset-behaviour.md Hi Gina, I've just read through this thoroughly, and am simultaneously impressed with your investigation, and amazed at how many inconsistencies you found. I think the proposed granular interfaces absolutely make sense, given the different uses people have for such offsets. My only hesitation is that if you want "everything", things become quite verbose: class Foo implements DimensionFetchable, DimensionWritable, FetchAppendable, DimensionUnsettable { ... } function bar(DimensionFetchable $container) { ... } Unfortunately, I can't think of an easy solution to this without some form of type aliases. As an experiment, I tried writing a variation of Python's "defaultdict" [1] using all the new hooks (without actually testing it against any implementation). Here's what I came up with: https://gist.github.com/IMSoP/fbd60c5379ccefcab6c5af25eacc259b Most of it is straight-forward, but a couple of things stood out: * Separating offsetFetch from offsetGet is really useful, because we can avoid "auto-vivifying" a key that's only been read, never updated. In other words, isset($foo['a']) can remain false after running var_dump($foo['a']), but $foo['a']++ should still work. * The fetchAppend hook is quite confusing to implement, because it's used in a few subtly different scenarios. For instance, if it's actually $container[][$offset] = $value there is an implicit requirement that fetchAppend should return array|DimensionWritable, but presumably that has to be enforced after fetchAppend has returned. 
I'm not sure if there's anything that can be improved here; it probably just needs some examples in the user manual. [1] https://docs.python.org/3/library/collections.html#collections.defaultdict Overall, I think this is a really great proposal, and hope it proceeds smoothly. Regards, -- Rowan Tommins [IMSoP]
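For readers unfamiliar with the "defaultdict" idea discussed above, here is a minimal sketch using only the existing ArrayAccess interface, not the interfaces proposed in the RFC and not the code in the linked gist. The class name DefaultDict and the factory parameter are hypothetical. It shows the property the message cares about: reading a missing key yields a default without storing it, so isset() stays false until something is actually written.

```php
<?php
// Sketch only: built on today's ArrayAccess, not on the proposed
// DimensionFetchable / DimensionWritable interfaces.
final class DefaultDict implements ArrayAccess
{
    private array $data = [];

    public function __construct(private Closure $factory)
    {
    }

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->data[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        // Return the default for a missing key WITHOUT storing it,
        // so a plain read does not "auto-vivify" the key.
        return $this->data[$offset] ?? ($this->factory)();
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->data[] = $value;   // $dict[] = $value
        } else {
            $this->data[$offset] = $value;
        }
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->data[$offset]);
    }
}

$counts = new DefaultDict(fn () => 0);

$readValue = $counts['a'];            // default, nothing stored
$existsAfterRead = isset($counts['a']);

$counts['a'] = $counts['a'] + 1;      // explicit read, then write
$existsAfterWrite = isset($counts['a']);

var_dump($readValue, $existsAfterRead, $counts['a'], $existsAfterWrite);
```

With plain ArrayAccess the same offsetGet() serves both plain reads and fetches that precede a write, which is the distinction the separate offsetFetch hook in the proposal is meant to expose.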
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
On 17/03/2024 00:01, Ilija Tovilo wrote: For clarity, you are asking for a way to make the "virtualness" of properties more explicit, correct? Either more explicit, or less important: the less often the user needs to know whether a property is virtual, the less it matters how easily they can find out. Please let me know if you are aware of any other potentially non-intuitive cases. I agree that while they may not be immediately obvious to the user, most of the distinctions do make sense once you think about them. The remaining difference I can see in the current RFC which seems to be unnecessary is that combining &get with set is only allowed on virtual properties. Although it may be "virtual" in the strict sense, any &get hook must actually be referring to some value stored somewhere - that might be a backed property, another field on the current class, a property of some other object, etc: public int $foo { &get => $this->foo; set { $this->foo = $value; } } public int $bar { &get => $this->_bar; set { $this->_bar = $value; } } public int $baz { &get => $this->delegatedObj->baz; set { $this->delegatedObj->baz = $value; } } This sentence from the RFC applies equally to all three of these examples: > That is because any attempted modification of the value by reference would bypass a |set| hook, if one is defined. I suggest that we either trust the user to understand that that will happen, and allow combining &get and set on any property; or we do not trust them, and forbid it on any property. Apart from the things already mentioned, it's unclear to me whether, with such `set;` declarations, a `get`-only backed property should even be legal. With the complete absence of a write operation, the assignment within the `set` itself would fail. To make this work, the absence of `set;` would need to mean something like "writable, but only within another hook", which introduces yet another form of asymmetric visibility. 
Any write inside the get hook already by-passes the set hook and refers to the underlying property, so there would be no need for any default set behaviour other than throwing an error. It's not likely to be a common scenario, but the below works with the current implementation https://3v4l.org/t7qhR/rfc#vrfc.property-hooks class Example { public int $nextNumber { get { $this->nextNumber ??= 0; return $this->nextNumber++; } // Mimic the current behaviour of a virtual property: https://3v4l.org/cAfAI/rfc#vrfc.property-hooks set => throw new Error('Property Example::$nextNumber is read-only'); } } Fair enough. 1 and 2 are reasons why we added the `$field` macro as an alternative syntax in the original draft. I don't quite understand point 3. In Kotlin, `field` is only usable within its associated hook. Other languages I'm aware of do not provide a way to access the backing value directly, neither inside nor outside the accessor. We are already allowing more than Kotlin by letting hooks call out to a method, and have that method refer back to the raw value. Hypothetically, we could allow *any* method to access it, using some syntax like $this->foo::raw. As a spectrum from least access to most access: 1) $field - accessible only in the lexical scope of the hook 2) $this->foo - accessible in the dynamic scope of the hook, e.g. a hook calling $this->doSomething(__PROPERTY__); 3) $this->foo::raw - accessible anywhere in the class, e.g. a public clearAll() method by-passing hooks Whichever we provide for backed properties, option 3 is available for virtual properties anyway, and common with __get/__set: store a value in a private property, and have a public hooked property providing access to it. I understand now that option 2 fits most easily with the implementation, and with decisions around inheritance and upgrade of existing code; but the other options do have their advantages from a user's point of view. 
I personally do not feel strongly about whether asymmetric types make it into the initial implementation. Larry does, however, and I think it is not fair to exclude them without providing any concrete reasons not to. I will spend time in the following days cleaning up tests, and I will try my best to try to break asymmetric types. If I (or anybody else) can't find a way to do so, I don't see a reason to remove them. My concern is more about the external impact of what is effectively a change to the type system of the language: will IDEs give correct feedback to users about which assignments are legal? will tools like PhpStan and Psalm require complex changes to analyse code using such properties? will we be prevented from adding some optimisation to OpCache because these properties break some otherwise safe assumption? Maybe I'm being over-cautious, but those are the kinds of que
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
On 16/03/2024 17:51, Ilija Tovilo wrote: Properties can inherit both storage and hooks from their parent. Hopefully, that helps with the mental model. Of course, in reality it is a bit more complicated due to guards and references. That is a really helpful explanation, thanks; I hadn't thought about the significance of inheritance between hooked and non-hooked properties. I still think there will be a lot of users coming from other languages, or from using __get and __set, who will look at virtual properties first. Making things less surprising for those people seems worth some effort, but I'm not asking for a complete redesign. Dynamic properties are not particularly relevant today. The point was not to show how similar these two cases are, but to explain that there's an existing mechanism in place that works very well for hooks. We may invent some new mechanism to access the backing value, like `field = 'value'`, but for what reason? This would only make sense if the syntax we use is useful for something else. However, given that without guards it just leads to recursion, which I really can't see any use for, I don't see the point. I can think of several reasons we *could* explore other syntax: 1) To make it clearer in code whether a particular line is accessing via the hooks, or by-passing them 2) To make the code in the hooks shorter (e.g. `$field` is significantly shorter than `$this->someDescriptiveName`) 3) To allow code to by-pass the hooks at will, rather than only when called from the hooks (e.g. having a single method that resets the state of several lazy-loaded properties) Those reasons are probably not enough to rule out the current syntax; but they show there are trade-offs being made. To be honest, my biggest hesitation with the RFC remains asymmetric types (the ability to specify types in the set hook). It's quite a significant feature, with no precedent I know of, and I'm worried we'll overlook something by including it immediately. 
For instance, what will be the impact on people using reflection or static analysis to reason about types? I would personally be more comfortable leaving that to a follow-up RFC to consider the details more carefully. Nobody else has raised that, beyond the syntax; I'm not sure if that's because everyone is happy with it, or because the significance has been overlooked. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
On 16 March 2024 00:19:57 GMT, Larry Garfield wrote: >Well, reading/writing from within a set/get hook is an obvious use case to >support. We cannot do cached properties easily otherwise: > >public string $expensive { > get => $this->expensive ??= $this->compute(); > set { >if (strlen($value) < 50) throw new Exception(); >$this->expensive = $value; > } >} To play devil's advocate, in an implementation with only virtual properties, this is still perfectly possible, just one declaration longer: private string $_expensive; public string $expensive { get => $this->_expensive ??= $this->compute(); set { if (strlen($value) < 50) throw new Exception(); $this->_expensive = $value; } } Note that in this version there is an unambiguous way to refer to the raw value from anywhere else in the class, if you wanted a clearAll() method for instance. I can't stress enough that this is where a lot of my thinking comes from: that backed properties are really the special case, not the default. Anything you can do with a backed property you can do with a virtual one, but the opposite will never be true. The minimum version of backed properties is basically just sugar for that - the property is still essentially virtual, but the language declares the backing property for you, leading to: public string $expensive { get => $field ??= $this->compute(); set { if (strlen($value) < 50) throw new Exception(); $field = $value; } } I realise now that this isn't actually how the current implementation works, but again I wanted to illustrate where I'm coming from: that backed properties are just a convenience, not a different type of property with its own rules. > Being the same also makes the language more predictable, which is also a > design goal for this RFC. (Hence why "this is the same logic as > methods/__get/other very similar thing" is mentioned several times in the > RFC. Consistency in expectations is generally a good thing.) 
I can only speak for myself, but my expectations were based on: a) How __get and __set are used in practice. That generally involves reading and writing a private property, of either the same or different name from the public one; and that private property is visible everywhere equally, no special handling based on the call stack. b) What happens if you accidentally cause infinite recursion in a normal function or method, which is that the language eventually hits a stack depth limit and throws an error. So the assertion that the proposal was consistent with expectations surprised me. It feels to me like something that will seem surprising to people when they first encounter it, but useful once they understand the implications. Regards, Rowan Tommins [IMSoP]
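Expectation (a) above, __get and __set reading and writing a separately named private property that is visible everywhere in the class, can be sketched like this. It is an illustration only, runnable on PHP 8.0+ without hooks; the class name, the compute() body and the validation rule are hypothetical, not taken from the RFC.

```php
<?php
// Sketch of the "private property plus __get/__set" pattern.
final class Report
{
    private ?string $_expensive = null;   // the raw value, visible class-wide
    public int $computeCalls = 0;

    public function __get(string $name): mixed
    {
        if ($name !== 'expensive') {
            throw new Error("Undefined property: $name");
        }
        // Lazily compute and cache on first read.
        return $this->_expensive ??= $this->compute();
    }

    public function __set(string $name, mixed $value): void
    {
        if ($name !== 'expensive') {
            throw new Error("Undefined property: $name");
        }
        if ($value === '') {
            throw new InvalidArgumentException('Value must not be empty');
        }
        $this->_expensive = $value;
    }

    // Any method can reset the raw value; no call-stack rules involved.
    public function clearAll(): void
    {
        $this->_expensive = null;
    }

    private function compute(): string
    {
        $this->computeCalls++;
        return 'computed';
    }
}

$r = new Report();
$first = $r->expensive;    // triggers compute()
$second = $r->expensive;   // served from the cached raw value
$r->clearAll();
$third = $r->expensive;    // computed again after the reset

var_dump($first, $second, $third, $r->computeCalls);
```

The point of the sketch is that `$this->_expensive` means the same thing on every line of the class, whereas with a backed hooked property the meaning of `$this->expensive` depends on whether a hook is on the call stack.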
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
On 15 March 2024 17:11:29 GMT, Larry Garfield wrote: >On Wed, Mar 13, 2024, at 10:26 PM, Rowan Tommins [IMSoP] wrote: >> I think it would be more helpful to justify this design on its own >> merits, particularly because it's a significant difference from other >> languages (which either don't have a "real property" behind the hooks, >> or in Kotlin's case allow access to it only *directly* inside the hook >> definitions, via the "field" keyword). > >I'm not sure I follow. The behavior we have currently is very close to how >Kotlin works, from a user perspective. Unless I'm misunderstanding something, the backing field in Kotlin is accessible only inside the hooks, nowhere else. I don't know what would happen if a hook caused a recursive call to itself, but there's no mention in the docs of it bypassing the hooks, only this: > This backing field can be referenced in the accessors using the `field` > identifier and > The `field` identifier can only be used in the accessors of the property. And then a section explaining that more complex hooks should use a separate backing property - which is the only option in C#, and roughly what people would do in PHP today with __get and __set. Kotlin does have a special syntax for "delegating" hooks, but looking at the examples, they do not use the backing field at all, they have to provide their own storage. >I've lost track of which specific issue you have an issue with or would want >changed. The guards to prevent an infinite loop are necessary, for the same >reasons as they are necessary for __get/__set. I understand that *something* needs to happen if a recursive call happens, but it could just be an error, like any other unbounded recursion. I can also understand the temptation to make it something more useful than an error, and provide a way to access the "backing field" / "raw value" from outside the hook. 
But it does lead to something quite surprising: the same line of code does different things depending on how it is called. I doubt many people have ever discovered that __get and __set work that way, since as far as I can see it's only possible to use deliberately if you're dynamically adding and unsetting properties inside your class. So, I don't necessarily think hooks working that way is the wrong decision, I just think it's a decision we should make consciously, not one that's obvious. Regards, Rowan Tommins [IMSoP]
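The existing recursion guard for __get and __set that the message compares against can be observed with a small script. This is a sketch, assuming PHP 8.0+; the class name is hypothetical, and the #[AllowDynamicProperties] attribute is there only to avoid the dynamic-property deprecation notice on PHP 8.2+.

```php
<?php
#[AllowDynamicProperties]
final class Guarded
{
    public array $log = [];

    public function __get(string $name): mixed
    {
        $this->log[] = "__get($name)";
        // Same property name, but the guard for $name is active here,
        // so this access does NOT re-enter __get. There is no real
        // property to read, so the fallback is used.
        return $this->$name ?? 'fallback';
    }

    public function __set(string $name, mixed $value): void
    {
        $this->log[] = "__set($name)";
        // Guard active: this write bypasses __set and creates a
        // dynamic (public) property instead of recursing.
        $this->$name = $value;
    }
}

$g = new Guarded();

$read = $g->missing;   // goes through __get exactly once
$g->created = 1;       // goes through __set, which creates the property
$g->created = 2;       // the property now exists, so __set is not called

var_dump($read, $g->created, $g->log);
```

The line `$this->$name = $value;` is the kind of code the message describes: written outside __set it would call __set, written inside it writes directly.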
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
On 12/03/2024 22:43, Larry Garfield wrote: It's slightly different, yes. The point is that the special behavior of a hook is disabled if you are within the call stack of a hook, just like the special behavior of __get/__set is disabled if you are within the call stack of __get/__set. What happens when you hit an operation that would otherwise go into an infinite loop is a bit different, but the "disable to avoid an infinite loop" logic is the same. I guess I'm looking at it more from the user's point of view: it's very rare with __get and __set to have a method that sometimes accesses the "real" property, and sometimes goes through the "hook". Either there is no real property, or the property has private/protected scope, so any method on the class sees the "real" property *regardless* of access via the hook. I think it would be more helpful to justify this design on its own merits, particularly because it's a significant difference from other languages (which either don't have a "real property" behind the hooks, or in Kotlin's case allow access to it only *directly* inside the hook definitions, via the "field" keyword). The point is to give the user the option for full backwards compatibility when it makes sense. This requires jumping through some hoops, which is the point. This is essentially equivalent to creating a by-ref getter + a setter, exposing the underlying property. By creating a virtual property, we are "accepting" that the two are detached. While we could disallow this, we recognize that there may be valid use-cases that we'd like to enable. It also parallels __get/__set, where using &__get means you can write to something without going through __set. I get the impression that to you, it's a given that a "virtual property" is something clearly distinct from a "property with hooks", and that users will consciously decide between one and the other. 
This isn't my expectation; based on what people are used to from existing features, and other languages, I expect users to see this as an obvious starting point for defining a hooked property: private int $_foo; public int $foo { get => $this->_foo; set { $this->_foo = $value; } } And this as a convenient short-hand for exactly the same thing: public int $foo { get => $this->foo; set { $this->foo = $value; } } Choosing one or the other won't feel like "jumping through a hoop", and the ability to use an &get hook with one and not the other will simply seem like a weird oddity. In practice I expect virtual properties with both hooks to be very rare. Most virtual properties will, I expect, be lazy-computed get-only values. I don't think this is true. Both of these are, in the terms of the RFC, "virtual properties": public Something $proxied { get => $this->otherObject->thing; set { $this->otherObject->thing = $value; } } public Money $price; public int $pricePence { get => $this->price->asPence(); set { $this->price = Money::fromPence($value); } } I can also imagine generated classes with "virtual" properties which call out to generic "getCached" and "setAndClearCache" methods doing the job of this pair of __get and __set methods: https://github.com/yiisoft/yii2/blob/master/framework/db/BaseActiveRecord.php#L274 With the change to allow &get in the absence of set, I believe that would already work. cf: https://3v4l.org/3Gnti/rfc#vrfc.property-hooks Awesome! The RFC should probably highlight this, as it gives a significant extra option for array properties. (Also, good to know 3v4l has a copy of the branch; I hadn't thought to check.) Regards, -- Rowan Tommins [IMSoP]
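The parallel drawn above, that a by-reference &__get lets callers modify a value without going through __set, can be sketched with today's magic methods. This is an illustration only, assuming PHP 8.0+; the class name Bag and its members are hypothetical.

```php
<?php
final class Bag
{
    private array $items = [];
    public int $setCalls = 0;

    // Returning by reference hands the caller a reference to the
    // private array itself.
    public function &__get(string $name): mixed
    {
        if ($name === 'items') {
            return $this->items;
        }
        throw new Error("Undefined property: $name");
    }

    public function __set(string $name, mixed $value): void
    {
        $this->setCalls++;
        if ($name === 'items') {
            $this->items = (array) $value;
        }
    }

    public function snapshot(): array
    {
        return $this->items;
    }
}

$bag = new Bag();

// Element write: fetched by reference via &__get, so __set never runs.
$bag->items['beep'] = 'boop';
$callsAfterElementWrite = $bag->setCalls;

// Whole-property write: this one does go through __set.
$bag->items = ['a' => 1];
$callsAfterAssignment = $bag->setCalls;

var_dump($callsAfterElementWrite, $callsAfterAssignment, $bag->snapshot());
```

Any validation placed in __set is skipped for the element write, which is the trade-off being debated for combining a by-reference get hook with a set hook.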
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
On 08/03/2024 15:53, Larry Garfield wrote: Hi folks. Based on earlier discussions, we've made a number of changes to the RFC that should address some of the concerns people raised. We also had some very fruitful discussions off-list with several developers from the Foundation, which led to what we feel are some solid improvements. https://wiki.php.net/rfc/property-hooks Hi Larry, Thanks again for the continuing hard work on this! > if a |get| hook for property |$foo| calls method |bar()|, then inside that method |$this->foo| will refer to the raw property, both read and write. If |bar()| is called from somewhere other than the hook, reading from |$this->foo| will trigger the |get| hook. This behavior is identical to that already used by |__get| and |__set| today. I'm slightly confused by this. If there is an actual property called $foo, then __get and __set will be called only when it is out of visibility, regardless of the call stack - e.g. a private property will always trigger __get from public scope, and always access it directly from private scope: https://3v4l.org/R5Yos That seems to differ from what's proposed, where even a private call to bar() would trigger the hook. The protection against recursion appears to only be relevant for completely undefined properties. For __get, the direct access can never do anything useful - there's nothing to access: https://3v4l.org/2nDZS For __set, it is at least possible for the non-recursive write to succeed, but only in the niche case of creating a dynamic property: https://3v4l.org/dpYOj I'm not sure that there's any equivalent to this scenario for property hooks, since they can never be undefined/dynamic. > There is one exception to the above: if a property is virtual, then there is no presumed connection between the get and set operations. [...] For that reason, |&get| by reference is allowed for virtual properties, regardless of whether or not there is a |set| hook. 
I don't agree with this, and the example immediately following it demonstrates the exact opposite: the &get and set hooks are both proxying to the same backing value, and have all the same problems as if the property was non-virtual. I would imagine a lot of real-life virtual properties would be doing something similar: converting to/from a different type, proxying to another object, etc. I think this exception is unnecessarily complicated: either trust users to handle the implications of combining &get with set, or forbid it. > Additionally, |&get| hooks are allowed for arrays as well, provided there is no |set| hook. I mentioned in a previous e-mail the possibility of using the &get hook for array writes. Has this been considered? That is: $c->arr['beep'] = 'boop'; Would be equivalent to: $temp =& $c->arr; $temp['beep'] = 'boop'; unset($temp); Which would be valid if $arr had an &get hook defined. > A |set| hook on a typed property must declare a parameter type that is the same as or contravariant (wider) from the type of the property. > Once a property has both a |get| and |set| operation, however, it is no longer covariant or contravariant for further extension. How do these two rules interact? Could this: public string $foo { get => $this->_foo; set(string|Stringable $value) { $this->_foo = (string)$value; } } be over-ridden by this, where the property's "main type" remains invariant but its "settable type" is contravariant? public string $foo { get => $this->_foo; set(string|Stringable|SomethingElse $value) { $this->_foo = $value instanceof SomethingElse ? $value->asString() : (string)$value; } } > ReflectionProperty has several new methods to work with hooks. There should be some way to reliably determine the "settable type" of a property. At the moment, I think you would have to do something like this: $setHook = $property->getHook(PropertyHookType::Set); $writeType = $setHook === null ? 
$property->getType() : $setHook->getParameters()[0]->getType(); Once again, I would like to make the case that asymmetric types are an unnecessary complication that should be left to Future Scope. The fact that none of the other languages referenced have such a feature should also give us pause. There's nothing to stop us being the first to innovate a feature, but we should be extra cautious when doing so, with no previous experience to learn from. It also means there is no expectation from users coming from other languages that this will be possible. If it genuinely seems useful, it can be added in a follow-up RFC, or even a later version of PHP, with little impact on the rest of the feature. But if we add it now and regret it, or some detail of its implementation, we will be stuck with it forever. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
ed (or, it seemed to me) that $field was just an alias for referencing the "real" property. That's a really tempting interpretation, but it's not what's happening. What's really happening is that the property itself is virtual: every single access to it goes through the hooks. But, within the hooks, we have provided a magic variable, stored on the object but accessible only there, where the hooks can store a value of the same type as the virtual property. Once I came to that interpretation, it became much more intuitive to call that magic variable by a magic name like $field than to re-use the syntax that would normally refer to the property, and make it sometimes reference this new thing instead. To re-iterate an earlier point, though, I think the language should choose. There should be exactly one way to refer to the backing field, whether that's $this->foo, $field, or get_backing_field(). Don't leave users reading each other's code and not being sure if it's doing the same thing. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
On 27/02/2024 17:49, Erick de Azevedo Lima wrote: > It sounds like most people are just really, really pissed off by an implicit variable I think that it could be good to follow the PHP way to mark the "magic" stuff, which is putting leading underscores on the magic stuff. I think that might help; I also think that even if the RFC offers a choice to the list, the final implementation should not offer choice to users. I think part of what put people off with the original wording was that it implied $field was an alias for $this->propertyName, but the alias was "preferred". The reality is that we have a new thing that we need a name/syntax for, and $field or $this->propertyName are possible options. To avoid another lengthy e-mail, I've put together some alternative RFC wording. The main idea is to switch the framing from "hooks on top of properties, which may be virtual" to "hooked properties which are virtual by default, but may access a special backing field". As noted in the introduction this is *not* intended as a counter-proposal or critique, just somewhere to collate my thoughts and suggestions: https://wiki.php.net/rfc/property-hooks/imsop-suggestion Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] What's up with emails?
This is a test e-mail from a subscribed GMail address, to see if the "451: Temporary lookup failure" errors are now resolved. Thanks to those working on it! -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
On 26 February 2024 23:11:16 GMT, Frederik Bosch wrote: >And what happens in the following situation, how are multiple get calls >working together? > >public string $fullName { > get => $this->first . ' ' . $this->last; // is this accessing the backed >value, or is it accessing via get > set($value) => $this->fullName = $value; >} > >public string $first { > get => explode(' ', $this->fullName)[0], // is this accessing the backed >value, or is it accessing via get > set($value) => $value; >} I don't think it's *that* confusing - the rule is not "hooks vs methods", it's "special access inside the property's own hook". But as I say, I'm coming around to the idea that using a different name for that "backing field" / "raw value" might be sensible. >> What would happen if a setter contained both "return 42;" and "return;"? The >> latter is explicitly allowed in "void" functions, but is also allowed in a >> non-void function as meaning "return null;" >return 42; // returns (int)42 >return; // early return, void, same as no return >return null; // returns null I'm not sure if you misunderstood my question, or just the context of why I asked it. I'm talking about a hook like this: set($value) { if ($value) { return 42; } else { return; } } Currently, the only definition of "void" in the language is that a void function must not contain an explicit return value. We could turn that check around, and deduce that a certain hook is void. This hook would not pass that check, so we would compile it to have an assignment, and the false case would assign null to the property. To avoid that, we would need some additional analysis to prove that in all possible paths, a return statement with a value is reached. The alternative would be to run the code, and somehow observe that it "returned void". But "void" isn't a value we can represent at run-time; we would need to set the return value to some special value just for this specific case. 
We would have to turn that on just for hook bodies, as returning it from normal functions would be a huge BC break, and also not very useful - with union types, there would be plenty of better options for a function to indicate a return value that needs special handling. >$generator = setCall($class, 'first', $value); >foreach ($generator as $value) { > writeProperty($class, 'first', $value); >} >if ($generator->hasReturn()) { >writeProperty($class, 'first', $generator->getReturn()); >} That's already an order of magnitude more complicated than "the return value is used on the right-hand side of an assignment", and it's missing at least one case: set($value) { return $value; } will not compile to a generator, so needs to skip and assign the value directly. By "magic", what I meant was "hidden logic underneath that makes it work". Assign-by-return has a small amount of magic - you can express it in half a line of code; assign-by-yield has much more magic - a whole bunch of loops and conditionals to operate your coroutine. > The yield is much more intuitive than magic fields I think we'll just have to differ in opinion on that one. Maybe you're just more used to working with coroutines than I am. Note that yield also doesn't solve how to read the current backing value in a get hook (or a set hook that wants to compare before and after), so we still need some way to refer to it. Regards, Rowan Tommins [IMSoP]
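To make the comparison between "assign-by-return" and "assign-by-yield" above concrete, here is a rough sketch of the dispatch logic a caller would need. It is a thought experiment in plain PHP, not how any engine implements hooks; the function names are hypothetical. Generator::getReturn() is a real method, but as far as I know there is no Generator::hasReturn(), so this sketch cannot tell "no return statement" apart from "return null".

```php
<?php
// A "setter" written as a coroutine: yields one value, returns another.
function setWithYield(string $value): Generator
{
    yield strtolower($value);
    return strtoupper($value);
}

// Collects every value that would be written to the property.
function applySetter(callable $hook, string $value): array
{
    $writes = [];
    $result = $hook($value);

    if ($result instanceof Generator) {
        // Assign-by-yield: one write per yielded value...
        foreach ($result as $yielded) {
            $writes[] = $yielded;
        }
        // ...plus the final return value, available only once the
        // generator has finished.
        $writes[] = $result->getReturn();
    } else {
        // Assign-by-return: a plain function is not a generator,
        // so its result is written directly.
        $writes[] = $result;
    }

    return $writes;
}

$viaYield = applySetter('setWithYield', 'Abc');
$viaReturn = applySetter(fn (string $v): string => $v . '!', 'x');

var_dump($viaYield, $viaReturn);
```

Even this simplified version needs a type check, a loop and a second branch, which is the "order of magnitude more complicated" point made in the message.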
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
On 26/02/2024 20:21, Frederik Bosch wrote: I do note that $this->propName might suggest that the backing value is accessible from other locations than only the property's own get/set methods, because of $this usage. Yes, I actually stumbled over that confusion when I was writing some of the examples in my lengthy e-mail in this thread. As I understand it, this would work: public string $foo { get { $this->foo ??= 0; $this->foo++; return $this->foo; } set { throw new Exception; } } Outside the hooks, trying to write to $this->foo would throw the exception, because it refers to the hooked property as a whole; but inside, the same name refers to something different, which isn't accessible anywhere else. Now that I've looked more at how Kotlin uses "field", I understand why it makes sense - it's not an alias for the property itself, but the way to access a "backing store" which has no other name. Using $this->foo as the name is tempting if you think of hooks as happening "on top of" the "real" property; but that would be a different feature, like Swift's "property observers" (willSet and didSet). What's really happening is that we're declaring two things at once, and giving them the same name; almost as if we'd written this: public string $foo { get { static $_foo; $_foo ??= 0; $_foo++; return $_foo; } set { throw new Exception; } } Kotlin's "field" is kind of the equivalent of that "static $_foo". Regarding returning void=null, this is something that IDE and static analyzers already pick-up as an error. I think being stricter on that in this RFC would actually make sense, and treat void not as null. What would happen if a setter contained both "return 42;" and "return;"? The latter is explicitly allowed in "void" functions, but is also allowed in a non-void function as meaning "return null;" And why yield is magic, I do not get that. The word and the expression actually expresses that something is, well, yielded. But yielded to where? 
My mental model of "return to set" is that this: public string $name { set($value) { $x = something($value); return $x + 1; } } Is effectively: private function _name_set($value) { $x = something($value); return $x + 1; } plus: $this->name = $this->_name_set($value); With "yield", I can't picture that simple translation; the "magic" is whatever translates the "yield" keyword into "$this->name =". I would file it with the type widening in the RFC: seems kind of cool, but probably isn't worth the added complexity. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
On 26/02/2024 19:02, Frederik Bosch wrote: That's how it always has been, no? So in your example, short code abbreviated form would not work. One has to write a block. public string $fullName { set => [$this->first, $this->last] = explode(' ', \ucfirst($value)); // error, $fullName is a string, returning array } public string $fullName { set { [$this->first, $this->last] = explode(' ', \ucfirst($value)); // no error, not returning } } I think the intention is that both the block and the arrow syntax would have any return value ignored, as happens with constructors, for example. Note that in PHP, there is actually no such thing as "a function not returning a value", even a "void" function actually returns null; so if the return value was treated as meaningful, your second example would give an error "cannot assign null to property of type string". However, as noted in a previous message, I agree that the short form meaning "the value returned is saved to the backing field" is both more expected and more useful. The "yield" idea is ... interesting. I think personally I find it a bit too magic, and too cryptic to be more readable than an explicit assignment. Opinions may vary, though. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] is this thing on?
On Sun, 25 Feb 2024, at 20:02, Rob Landers wrote: > Before I get to the meat of this email, first of all, IMHO, anyone should be > able to email the list, even if they are not a > member of the list. I've had to email ubuntu lists about bugs before and I > really have no desire to join those lists, but > I was always able to just send the email to the list just fine. The biggest problem with an open list is how to manage spam - if you don't catch the spam on the list server, it not only ends up in hundreds of inboxes, but in multiple archives and mirrors of the list. I don't know how the lists you mentioned handle that. This has also come up in the past regarding moving from e-mail to $currently_fashionable_technology - having some barrier to entry is actually quite useful, since we want people to put some effort into their contributions beyond "me too" or "I had this crazy idea in the pub". Note that this is exactly why bugs.php.net was abandoned: there was too much spam and low-quality content. > Now for the issue: > > gmail is failing to send emails to the list (hence why it has probably been a > bit quite around here). Here is the error: > > The response from the remote server was: > 451 4.3.0 : Temporary lookup failure People are aware of this issue, and looking into it. In case you missed the previous thread, two things have unfortunately happened at once: - The mailing list was moved to a new server - GMail rolled out a much tighter set of anti-spam rules It's not immediately clear which of these is responsible for the 451 errors, but as I say, people are working on it. > Now, to go figure out how to unsubscribe this email from the list... Exactly the same way you subscribed, I believe: via the web form, or using +unsubscribe in the to address. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
On Thu, 22 Feb 2024, at 23:56, Larry Garfield wrote: > However, I just had a long discussion with Ilija and there is one > possibility we could consider: Use the return value only on the > shorthand (arrow-function-like) syntax. > > So you could do either of these, which would be equivalent: > > set { > $this->phone = $this->santizePhone($value); > } > > set => $this->santizePhone($value); Regarding this point, I've realised that the current short-hand set syntax isn't actually any shorter: set { $this->phone = $this->santizePhone($value); } set => $this->phone = $this->santizePhone($value); It also feels weird to say both "the right-hand side must be a valid expression" and "the value of the expression is ignored". So I think making the short-hand be "expression to assign to the implicit backing field" makes a lot more sense. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2
name{ get => new UnicodeString($this->name_string); set=> $this->name_string = (string)$value; } public string $name_string; This exotic "asymmetric typing" is then being used to justify other decisions - if you can specify the setter's type, it's confusing if you specify a name without a type; so we need to make the name optional as well... Compare to C#, where "value" is not a default, it's an unchangeable keyword; or Kotlin, where naming it is mandatory but doesn't have to mention the type. I think my concerns about distinguishing "virtual properties" may stem from a similar cause. In C#, all "properties" are virtual - as soon as you have any non-default "get", "set" or "init" definition, it's up to you to declare a separate "field" to store the value in. Swift's "computed properties" are similar: if you have a custom getter or setter, there is no backing store; to add behaviour to a "stored property", you use the separate "property observer" hooks. Kotlin's approach is philosophically the opposite: there are no fields, only properties, but properties can access a hidden "backing field" via the special keyword "field". Importantly, omitting the setter doesn't make the property read-only, it implies set(value) { field = value } The current RFC attempts to combine all of these ideas into one syntax, on top of everything the language already has. The result has some odd-shaped corners. For instance, this won't work: public string $name { set => throw new Exception('Read-only property ' . __PROPERTY__); } But this will: public string $name { set => throw new Exception('Read-only property ' . __PROPERTY__ . '; current value is: ' . $this->name); } The first declares a virtual property, with no default getter, like in C# or Swift. The second instead acts like Kotlin, and has a default getter referencing the implicit backing field. It would be clearer to choose one style or the other: explicitly enable the defaults... 
public string $name { get; set => throw new Exception('Read-only property ' . __PROPERTY__); } // default getter and backing field requested public string $name { get => $this->name ??= $this->generateName(); } // setter disabled because it's not mentioned, even though backing field is used ...or explicitly disable them: public string $name { set => throw new Exception('Read-only property ' . __PROPERTY__); } // implied default getter and backing field public virtual string $name { get => $this->firstName . ' ' . $this->lastName; } // setter disabled because property is declared virtual I think there's some really great functionality in the RFC, and would love for it to succeed in some form, but I think it would benefit from removing some of the "magic". Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] Re: [RFC] OOP API for cURL extension
On 18 February 2024 15:26:37 GMT, Lynn wrote: > Having a lot of setters for options might make it really hard to find the >methods you're looking for in terms of auto-complete in your IDE. I think it would be significantly better for that purpose than what we have now, because there would be a lot fewer methods than there are current option constants. Firstly, because most of the methods would cover multiple overlapping or related options - e.g. setHttpMethod(string $method) covers CURLOPT_POST, CURLOPT_PUT, CURLOPT_CUSTOMREQUEST, and CURLOPT_HTTPGET; setBasicAuth($username, $password) combines CURLOPT_HTTPAUTH, CURLOPT_USERNAME, and CURLOPT_PASSWORD. Secondly, because some functionality that's not used as often can just be left to the curl_setopt equivalents forever, e.g. we don't need new methods for CURLOPT_DNS_SHUFFLE_ADDRESSES, CURLOPT_HAPPY_EYEBALLS_TIMEOUT_MS, etc, etc. The initial aim could be to cover, say, the 10 most commonly used settings - things like choosing the request method, and including custom request headers. Over time, we could add more methods for common tasks, but continue adding constants / enum cases for more obscure features of the library. > Would it >make sense to split options into a separate object (or perhaps multiple), >that could in theory also be shared between different CurlHandle instances? While I'm not against splitting things up into more objects, I think that becomes a much bigger task to define what goes in each, and harder to do half-way. My gut feeling is that it would descend into a lot of bikeshedding, and stop us making progress; whereas adding a few methods for common use cases could present a real quick win. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] Re: [RFC] OOP API for cURL extension
On 17 February 2024 15:57:20 GMT, Larry Garfield wrote: >The RFC would also benefit greatly from some practical examples of using the >new API. Right now it's not clear to me (as someone who almost never uses >Curl directly) how/why I'd use any of these, since there's still "a whole >crapton of int constants I don't understand floating around." The suggestion >to use an Enum (or several) here is a good one and would help a lot with that, >so I'm +1 there. To my mind, the *eventual* aim should be that users don't *need* a userland wrapper just to make a simple request in a readable way, and that setting raw curl options becomes an advanced feature that most users never need. I know a lot of people's minds will immediately go to request and response objects, but I think we can go a long way by just making well-named methods wrapping one or two curl options each, so that you could write this: $ch = new CurlHandle('https://example.com'); $ch->setMethod('POST'); $ch->setRequestBody('{"stuff":"here"}'); $ch->setBasicAuth('admin', 'correct-horse-battery-staple'); $result = $ch->executeAndReturn(); Note that I am not saying every one of those methods needs to be added right now; adding a few at a time may be sensible to have time to discuss good names and signatures. But to me, renaming CURLOPT_POSTFIELDS to Curl\StringOptionsEnum::POSTFIELDS doesn't get us very far - users shouldn't need a raw curl setting for such a basic feature in the first place. Regards, -- Rowan Tommins [IMSoP]
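To make the "well-named methods wrapping one or two curl options each" idea concrete, here is a rough userland sketch built on the existing curl_* functions. It is not the API proposed in the RFC: the class name SimpleCurl is invented (PHP 8 already has a method-less CurlHandle class), the method names simply mirror the example in the message, and the choice of which options each method sets is an assumption.

```php
<?php
// Hypothetical userland wrapper; SimpleCurl and its methods are
// assumptions for illustration, not part of ext/curl or of the RFC.
final class SimpleCurl
{
    private \CurlHandle $handle;

    public function __construct(string $url)
    {
        $handle = curl_init($url);
        if ($handle === false) {
            throw new RuntimeException('Could not initialise curl');
        }
        $this->handle = $handle;
    }

    // One readable method instead of a raw CURLOPT_CUSTOMREQUEST constant.
    public function setMethod(string $method): void
    {
        curl_setopt($this->handle, CURLOPT_CUSTOMREQUEST, $method);
    }

    public function setRequestBody(string $body): void
    {
        curl_setopt($this->handle, CURLOPT_POSTFIELDS, $body);
    }

    // Combines three related options behind a single call.
    public function setBasicAuth(string $username, string $password): void
    {
        curl_setopt($this->handle, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
        curl_setopt($this->handle, CURLOPT_USERNAME, $username);
        curl_setopt($this->handle, CURLOPT_PASSWORD, $password);
    }

    // Always returns the body, so the return type is a plain string
    // and failures surface as exceptions rather than a false return.
    public function executeAndReturn(): string
    {
        curl_setopt($this->handle, CURLOPT_RETURNTRANSFER, true);
        $result = curl_exec($this->handle);
        if ($result === false) {
            throw new RuntimeException(curl_error($this->handle));
        }
        return $result;
    }
}
```

With this in place, the five-line example from the message works as written once CurlHandle is swapped for SimpleCurl; nothing in the calling code needs to know a CURLOPT_* constant.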
[PHP-DEV] Re: [RFC] OOP API for cURL extension
On 16 February 2024 16:09:32 GMT, Rowan Tommins wrote: >public function executeAndReturn(): string >public function executeAndOutput(): void I guess I missed: public function executeToFile(Stream $fileHandle): void public function executeWithCallback(callable $writeFunction): void which would imply CURLOPT_FILE and CURLOPT_WRITEFUNCTION, respectively. From what I can see, these four modes are actually mutually exclusive (populating ch->handlers.write->method) with whichever option is touched last governing the actual behaviour of curl_exec(). For instance, setting CURLOPT_FILE to null or CURLOPT_RETURNTRANSFER to false always selects stdout mode, effectively clearing any value set with CURLOPT_WRITEFUNCTION. Having separate execute methods would make that much more obvious. Incidentally, I notice there is currently some code in _php_curl_verify_handlers where a bad stream in CURLOPT_FILE will fall back to writing the result to stdout. Is it me, or is that a really terrible idea, potentially exposing private data to the user? Should that scenario be promoted to an immediate false return in curl_exec, and Error in the new OO wrapper? Regards, -- Rowan Tommins [IMSoP]
[PHP-DEV] Re: [RFC] OOP API for cURL extension
On 15 February 2024 15:44:13 GMT, Sara Golemon wrote: >* CurlHandle::exec() mixed typing of return values. > Comment: Agreed. The `true` return value becomes meaningless in the >RETURNTRANSFER==false case. > Proposal: Update the RFC for CurlHandle::execute() to return ?string. Should we take this a step further, and remove CURLOPT_RETURNTRANSFER as a valid option on the object completely? Instead of an overloaded exec() method, provide: public function executeAndReturn(): string public function executeAndOutput(): void Perhaps the option could be accepted in the relevant setOpt methods, but issue a warning that it has no effect. Since both the default for the option and the name of the method are changing anyway, I don't think this significantly affects the migration effort for the tiny minority of cases where you actually want the direct output behaviour. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC] Deprecate implicitly nullable parameter type
On 22 January 2024 10:21:12 GMT, tag Knife wrote: >As you are mistaking `int $var = null` params as "nullable". Which they >are not, they are "optional default" parameters. The feature which is being discussed is that, for the specific case of "= null", the parameter is made both optional *and* nullable. To make it clearer, the following all declare default values within the allowed type: int $foo = 1 int|string $foo = 'hello' ?int $foo = 1 ?int $foo = null The following all lead to a type error, because the default value isn't allowed for the declared type: int $foo = 'hello' int|string $foo = new DateTime; ?int $foo = 'hello' However, there is a special case: for purely historical reasons, a default of null is allowed *even when it doesn't match the declared type*: int $foo = null int|string $foo = null These are processed as though they were declared as nullable types; and the fix for the proposed deprecation would be to do so: ?int $foo = null int|string|null $foo = null The fact that the feature is tricky to explain is a good reason to deprecate it, and I think I support the proposal unless I see a good argument against. Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
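The special case can be observed through reflection. The snippet below is a small sketch; the function names are made up for the demonstration, and depending on the PHP version the first declaration may emit a deprecation notice when the file is compiled.

```php
<?php
// Hypothetical function names, used only to compare the two declarations.

// Implicitly nullable: the "= null" default silently widens int to ?int.
function implicitNullable(int $foo = null): ?int
{
    return $foo;
}

// Explicitly nullable: the same behaviour, but spelled out in the type.
function explicitNullable(?int $foo = null): ?int
{
    return $foo;
}

$implicit = (new ReflectionFunction('implicitNullable'))->getParameters()[0];
$explicit = (new ReflectionFunction('explicitNullable'))->getParameters()[0];

// Both parameters report that null is an accepted value.
var_dump($implicit->getType()->allowsNull());
var_dump($explicit->getType()->allowsNull());

// Passing null explicitly is accepted by both, not just omitting the argument.
var_dump(implicitNullable(null) === explicitNullable(null));
```

All three var_dump() calls report true, which is the point of the message: the parameter is nullable, not merely optional.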
Re: Fwd: [PHP-DEV] clarify the destructuring list() concept
On 5 January 2024 23:16:58 GMT, "Mönôme Epson" wrote: >I try to follow the procedure of: https://wiki.php.net/rfc/howto Then I suggest you take a moment to think about what you're proposing to change, and write it out clearly, as in "I propose to make this code ... do this ... instead of this ..." At the moment, your messages contain a lot of open-ended questions, and examples that you can easily look up the current behaviour of yourself. Keep in mind that there are millions of existing PHP applications which might be impacted by your change. In general, changing something which is currently an error is not a problem; but if the code currently succeeds, there is a chance that somebody somewhere is using it. You need to think about how the change will affect that existing code. >It seems to me that list() is not clearly specified. Do you mean that the documentation of the feature is not clear? Or, that there are cases where its behaviour is inconsistent in some way? >list() supports destructuring assignment for arrays. Do you have an opinion >on object destructuring ? It might be interesting to allow an object on the right-hand side, and somehow specify properties to extract from it. I would be less interested in having an object on the left-hand side, which would presumably create an instance of stdClass, which I find pointless. Others might disagree. >*Do you think seeing list() as the reciprocal of a function call is >interesting ?* As I said previously, I would now always spell both array creation and array destructuring with [] not with array() and list(). Neither has ever behaved like a function call, and I don't think thinking of them that way is helpful. Instead, think of it as a way of specifying the content of an array, just like "hello $name" specifies the content of a string. 
Then think of array destructuring as the reverse of that construction: $arr = [$a, $b]; [$a, $b] = $arr; $arr = ['a' => $a, 'b' => $b]; ['a' => $a, 'b' => $b] = $arr; $arr = [1 => $b, 0 => $a]; [0 => $a, 1 => $b] = $arr; $arr = [$a, $b]; // keys 0 and 1 assigned by default [1 => $b, 0 => $a] = $arr; $arr = [1 => $b, 0 => $a]; [$a, $b] = $arr; // keys 0 and 1 taken by default $arr = ['a' => 1, $dynamicKey => 2]; [$dynamicKey => $two, 'a' => $one] = $arr; >Otherwise, how to use a default value, type hinting, nullable/optional >variable... It's not interesting ? Some of those could be useful in both array creation and array destructuring, but it's not as simple as copying the syntax of a function signature. Again, I suggest you write down a specific feature you are proposing to add, and think about how it would work. >There are many things that could be done with but don't work. For example : > >$array = [1, 2, 3]; >[...$values] = [...$array]; I can see this would be useful as a "rest of the values" syntax, as in: $arr = [1,2,3,4,5]; [$first, $second, ...$rest] = $arr; It couldn't work as the exact inverse of construction, though; this is allowed: $arr1 = [1, 2, 3]; $arr2 = [...$arr1, ...$arr1]; But this wouldn't make any sense: $arr2 = [1, 2, 3, 1, 2, 3]; [...$arr1, ...$arr1] = $arr2; Once again, a proposal of exactly how it would work would be interesting. Regards, -- Rowan Tommins [IMSoP]
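The symmetry between construction and destructuring, and the current way of getting "the rest of the values" without new syntax, can be checked with a few lines of today's PHP. This is just a sketch; using array_slice() for the remainder is one possible workaround, not something the thread prescribes.

```php
<?php
// Destructuring matches on keys, not on the order they are written in.
$arr = ['x', 'y'];            // keys 0 and 1 assigned by default
[1 => $b, 0 => $a] = $arr;    // $a takes key 0, $b takes key 1

// String keys and dynamic keys work the same way.
$dynamicKey = 'b';
$map = ['a' => 1, $dynamicKey => 2];
[$dynamicKey => $two, 'a' => $one] = $map;

// No "...$rest" exists in destructuring, so take the leading values
// positionally and slice off the remainder by hand.
$numbers = [1, 2, 3, 4, 5];
[$first, $second] = $numbers;        // extra elements are simply ignored
$rest = array_slice($numbers, 2);    // [3, 4, 5]

var_dump($a, $b, $one, $two, $first, $second, $rest);
```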
Re: [PHP-DEV] clarify the destructuring list() concept
On 5 January 2024 20:44:00 GMT, "Mönôme Epson" wrote: >Hello internals, > >> The purpose of list() is to assign a list of variables. > >What should be the underlying concept behind the list() language construct? > >I propose that list() is the reciprocal of array(). >That is, if array() is a function call, then list() is the signature of a >function. > >Do you agree ? > >Regards, Alexandre > >PS: For historical reason, i propose to allow to syntaxe : >list(name: $name) = ['name'=>$name];// maybe discussed >list('name'=> $name) = ['name'=>$name]; I'm not sure what your question is, but the second option, with => as in all PHP contexts, already works: https://3v4l.org/dqgal Note that neither array() nor list() are functions, and both can be spelled [] as in ['name'=> $name2] = ['name'=>$name1]; https://3v4l.org/Uu5e4 The name for this if you want to find more information is "array destructuring". Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On 5 January 2024 12:18:51 GMT, Robert Landers wrote: >This is easy to handle from C. If the callback takes an argument, >don't fill in the super-globals. Again, that's compatible only in a narrow sense: it provides both APIs on any run-time which can do so safely. You still have an incompatible upgrade to make though: if you write code today for FrankenPHP, and directly use the super-global arrays it populates, you cannot take that code tomorrow and use it in Swoole, which does not provide those super-globals. If you write code today which uses callback parameters, you can take that code and use it unmodified with any system which provides those parameters - including async implementations. All that's missing for that to happen right now is a standard format for those parameters. > It allows legacy apps to be slowly >"upgraded" while allowing newer apps to take full advantage of a SAPI. It's actually quite easy to add most of the backwards compatibility needed for legacy apps in userland, by populating the superglobals, and running an output buffer to capture echo etc into the response. > However, if we go into the design with the >concurrent server story in mind, I think we can create something much >better than what is available from FrankenPHP. Precisely. That's why I used the phrase "forwards compatibility" - I'm not saying php-src needs to support all of this right now, just that *the API design* should have an eye on the future, not just the past. Regards, -- Rowan Tommins [IMSoP]
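The userland compatibility layer mentioned in that message could look roughly like the sketch below. Everything here is an assumption for illustration: the adaptLegacy() name, and the shape of the request array (keys 'get' and 'post'), since no standard format for callback parameters has been defined. It is only safe in a worker that handles one request at a time.

```php
<?php
// Hypothetical adapter: wraps a legacy entry point that reads superglobals
// and echoes output, so it can be used as a parameter-based request handler.
function adaptLegacy(callable $legacyApp): callable
{
    return function (array $request) use ($legacyApp): string {
        // Populate the global state the legacy code expects.
        $_GET = $request['get'] ?? [];
        $_POST = $request['post'] ?? [];

        // Capture anything the legacy code echoes into the response body.
        ob_start();
        try {
            $legacyApp();
        } finally {
            $body = (string) ob_get_clean();
        }
        return $body;
    };
}

// A stand-in for an old application that knows nothing about workers.
$legacy = function (): void {
    echo 'Hello, ', $_GET['name'] ?? 'world';
};

$handler = adaptLegacy($legacy);
echo $handler(['get' => ['name' => 'Rowan']]), PHP_EOL; // prints "Hello, Rowan"
```

Code written against the handler signature stays portable; only the adapter depends on global state, and it can be dropped once the legacy code is migrated.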
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On 5 January 2024 09:02:05 GMT, Robert Landers wrote: > I don't think they are fundamentally incompatible. If we look at >FrankenPHP's implementation, you pass a callback that gets called when >there is a request. No, you pass a callback which is called exactly once, for the next request. You have to implement your own loop if you want to handle multiple requests, which obviously isn't how it would work with an async event loop. That was one of my suggested changes: move the loop into C, so that the API was "callback called for each request". This actually *adds* flexibility on the server end, to decide how often to call that callback, do so asynchronously, etc. > Globals is how this works (atm) It's how it works for native SAPIs. It's not, as far as I know, how any worker system other than FrankenPHP has implemented its API. Every other implementation I've seen, whether async or not, passes in some form of request data to the callback, with the exception of RoadRunner, which gives the data as a return value from a "get next request" function. So, the second suggested change is to standardise on the most common pattern of passing parameters to a callback, rather than the unusual one of populating and clearing superglobals. As a bonus, this pattern works with both non-async and async workers. > changing the signature of the callback is generally backwards compatible This is true in the narrow sense that it won't cause any fatal errors. But if you write your application assuming that it will run in an environment where globals are populated for you, it will not run in an environment which no longer populates those globals. >Changing the underlying implementation in php-src when there are >native fibers/event loops probably won't even change anything (since >that was exactly how they were designed). Sounds great! So we don't need to wait to put that implementation in place then. >But holding up the entire conversation ... 
There is no reason whatsoever to hold anything up. The suggestion is not "don't implement any worker API until we have an async implementation", it's "a worker API sounds great, let's implement one that looks like this". Yes, it might take slightly longer to define some new array structures, but we're talking about a few hours' work to give us a much more flexible system, not weeks of complex engineering. If the proposal is "copy some code from FrankenPHP into php-src, which nobody else will want to use", it's pointless; if it's "standardise an API with some enabling code", then *of course* we want to spend a bit of time designing that API. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On 5 January 2024 06:55:34 GMT, Robert Landers wrote: >I already said this, but to reiterate: I, personally, hear what you >are saying and largely agree with you; however, before we can really >have any kind of discussion on concurrent servers, we HAVE to address >the underlying issues that are missing from PHP. In PHP-src So, let's address them... > there are no such things as request objects This is a non-issue. As has been discussed already, it's perfectly fine to have an event-based system where the event details are an associative array, rather than a rich object. > There are no such things as event loops. There are fibers, > but absolutely no std-library i/o functions are using them This is what the bulk of Daniil's email is suggesting a way to improve. >We have a long way to go before those will be real things that we can >have a proper conversation about in the context of php-src. If we keep waiting to have the conversation, it will never happen. And if we start building brand new APIs like infrastructure for worker-mode SAPIs, in ways that are fundamentally incompatible with async, we're just making more work for ourselves when we do get there. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
I'm running out of different ways to say the same thing, and not really sure which part of my previous messages people haven't understood. I'm really not saying anything very controversial or complicated, just "had you considered that style B would offer these additional possibilities over style A". So I'm going to quote the parts of previous messages which I think already answer the latest questions. After that, I'm going to leave the thread to others for a bit, unless I see a question that isn't just retreading the same ground. On 01/01/2024 17:36, Pierre Joye wrote: Unless I misunderstand the current proposal, it is about providing a core interface to allow one to create its own SAPI similar to FrankenPHP, which does not handle request in a singe thread but a thread pool handled by go's coroutine. From Kévin's opening post: In addition to FrankenPHP, projects such as RoadRunner and Swoole provide engines supporting worker modes. [...] the existence of a common infrastructure would standardize the way worker scripts are created and provide a high-level PHP API for writing worker scripts that work with all SAPIs that rely on this new feature. From my last message: If we're attempting to standardise a new API for worker modes (i.e. HTTP servers which are no longer "shared nothing"), choosing one which can be used by consecutive worker modes (FrankenPHP , RoadRunner) but not concurrent ones (Swoole, ReactPHP, AMPHP) feels like a big missed opportunity. On 01/01/2024 17:40, Robert Landers wrote: I'm not sure concurrent servers would even be able to be in scope if we wanted them to be? From my message dated 2023-12-29 22:55 UTC: Note that both async and WebSockets were mentioned as possible "forward compatibility". If we're talking about "next generation SAPIs", these are the kinds of features that people will be - and already are - developing; so it seems foolish not to at least consider them when designing new baseline APIs. 
On 01/01/2024 17:36, Pierre Joye wrote: It is a first step and based on the usages/feedback, the next steps could be the second part of your comment. Or? From my message dated 2023-12-31 01:20 UTC: if you standardise on an API that populates global state, you close off any possibility of using that API in a concurrent environment. If you instead standardise on callbacks which hold request and response information in their own scope, you don't close anything off. And from 2023-12-30 10:53 UTC: The key requirement is that you have some way of passing the current request and response around as scoped variables, not global state. That's essential for any kind of concurrent run-time (async, thread-aware, etc). Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On 31 December 2023 16:31:31 GMT, Pierre Joye wrote: >php handles this in threadsafe mode Depending on your exact definition of "php", this is either irrelevant or just plain wrong. If you mean "the HTTP SAPIs shipped with official builds of PHP", then it's true, none handle multiple concurrent requests in a single thread using async I/O. But none handle multiple consecutive requests in a single thread using a "worker mode" either, which is the whole point of this conversation. If you mean for "php" to include third party HTTP handlers such as FrankenPHP, then it also includes Swoole, which is what I was describing. Please someone correct me if I'm wrong, but I understand ReactPHP and AMPHP also include HTTP servers using the same principle. So, to reiterate my point once more: implementations of PHP using async concurrency are out there already in production use. If we're attempting to standardise a new API for worker modes (i.e. HTTP servers which are no longer "shared nothing"), choosing one which can be used by consecutive worker modes (FrankenPHP, RoadRunner) but not concurrent ones (Swoole, ReactPHP, AMPHP) feels like a big missed opportunity. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On 31 December 2023 08:31:16 GMT, "Kévin Dunglas" wrote: >This new function is intended for SAPIs. Swoole was given as an example of >worker mode, but it isn't a SAPI. AFAIK, it doesn't use the SAPI >infrastructure provided by PHP. >The scope of my proposal is only to provide a new feature in the SAPI >infrastructure to build worker modes to handle HTTP requests, not to deal >with non-SAPI engines. One of the advantages you suggested of your proposal is that users would have a consistent way to write worker scripts. To achieve that, you want a *design* that can be adopted by as many implementations as possible, regardless of how they implement it. Providing helper infrastructure for that design is a secondary concern - as you admit, the actual code you're proposing to add is quite short. >That being said, I don't understand what would prevent Swoole from >implementing the proposed API Then one of us is missing something very fundamental. As I understand it, Swoole's model is similar to that popularised by node.js: a single thread processes multiple incoming requests concurrently, using asynchronous I/O. For instance, a thread might run the following:
01 Request A received
02 Request A input validated
03 Request A sends async query to DB
04 Request A hands control to event loop while it awaits result
05 Request B received
06 Request B sends async HTTP call to some API
07 Request B awaits result
08 Request A resumed with DB result
09 Request A formats and returns response
10 Request A complete
11 Request B resumed
12 Request B formats and returns response
Each request has its own call stack, started by a different call to the registered event handler, but any global state is shared between them - there is no actual threading going on, so no partitioned memory. If requests are communicated by setting up superglobals, that will happen at step 01 and again at step 05. 
If you try to read from them at step 09, you would see them populated with information about request B, but you're trying to handle request A. It would be possible to work around that by placing large warnings to users not to read superglobals after any async call - basically forcing them to create scoped copies to pass around. But the worse problem is output: if step 09 and step 12 both just use "echo", how do you track which output needs to go to which network connection? You can't just set up an output buffer, because that's global state shared by both call stacks. You have to put *something* into the scope of the call stack - a callback to write output, an expected return value, etc. Asynchronous code ends up somewhat resembling functional programming: everything you want to have side effects on needs to be passed around as parameters and return values, because the only thing isolated between requests is local variable scope. Regards, -- Rowan Tommins [IMSoP]
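The interleaving in that trace can be reproduced in miniature with fibers (PHP 8.1 or later). This is a toy simulation, not how Swoole or any real server works: Fiber::suspend() stands in for "awaiting async I/O", the two handler functions are invented for the demonstration, and the request is reduced to a single 'id' value.

```php
<?php
// Handler in the "global state" style: it reads the superglobal only after
// it has yielded control, i.e. at the equivalent of step 09 in the trace.
function globalStyleHandler(): string
{
    Fiber::suspend();          // hand control back, as if awaiting a DB result
    return $_GET['id'];        // sees whichever request was set up last
}

// Handler in the "scoped" style: the request lives in a local variable.
function scopedStyleHandler(array $request): string
{
    Fiber::suspend();
    return $request['id'];
}

// Global style: request A arrives, suspends, then request B overwrites $_GET.
$_GET = ['id' => 'A'];
$fiberA = new Fiber('globalStyleHandler');
$fiberA->start();
$_GET = ['id' => 'B'];
$fiberB = new Fiber('globalStyleHandler');
$fiberB->start();
$fiberA->resume();
$fiberB->resume();
$globalResults = [$fiberA->getReturn(), $fiberB->getReturn()];

// Scoped style: same interleaving, but each call stack keeps its own request.
$fiberA = new Fiber('scopedStyleHandler');
$fiberA->start(['id' => 'A']);
$fiberB = new Fiber('scopedStyleHandler');
$fiberB->start(['id' => 'B']);
$fiberA->resume();
$fiberB->resume();
$scopedResults = [$fiberA->getReturn(), $fiberB->getReturn()];

var_dump($globalResults); // both entries are "B": request A read B's data
var_dump($scopedResults); // "A" then "B": each handler saw its own request
```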
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On 30 December 2023 19:48:39 GMT, Larry Garfield wrote: >The Franken-model is closer to how PHP-FPM works today, which means that is >easier to port existing code to, especially existing code that has lots of >globals or hidden globals. (Eg, Laravel.) That may or may not make it the >better model overall, I don't know, but it's the more-similar model. That's why I said earlier that it provides better backwards compatibility - existing code which directly uses PHP's current global state can more easily be run in a worker which populates that global state. However, the benefit is marginal, for two reasons. Firstly, because in practice a lot of applications avoid touching the global state outside of some request bootstrapping code anyway. The FrankenPHP example code and Laravel Octane both demonstrate this. Secondly, because in an environment that handles a single request at a time, the reverse is also possible: if the server passes request information directly to a callback, that callback can populate the superglobals as appropriate. The only caveat I can think of is input streams, since userland code can't reset and populate php://input, or repoint STDOUT. On the other hand, as soon as you have any form of concurrency, the two models are not interchangeable - it would make no sense for an asynchronous callback to read from or write to global state. And that's what I meant about FrankenPHP's API having poor forward compatibility - if you standardise on an API that populates global state, you close off any possibility of using that API in a concurrent environment. If you instead standardise on callbacks which hold request and response information in their own scope, you don't close anything off. If anything, calling this "forwards compatibility" is overly generous: the OP gave Swoole as an example of an existing worker environment, but I can't see any way that Swoole could implement an API that communicated request and response information via global state. 
Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On 30 December 2023 09:59:07 GMT, Robert Landers wrote: >For this to happen in PHP Core, there would need to be request objects >instead of a global state. Again, the representation as objects isn't a key requirement. Python's WSGI spec simply has a dictionary (read: associative array) of the environment based on CGI. The application might well turn that into a more powerful object, but standardisation of such wasn't considered a pre-requisite, and would actually have hampered ASGI, where not all events represent an HTTP request. The key requirement is that you have some way of passing the current request and response around as scoped variables, not global state. That's essential for any kind of concurrent run-time (async, thread-aware, etc). An event / subscriber model fits well with that: the local scope for each request is set up by an invocation of the callback with defined parameters and return value. Funnily enough, the example of a worker script for FrankenPHP does both things: it sends each request to the same application "handle" callback, passing in the super-global arrays as parameters to be used as non-global state. https://frankenphp.dev/docs/worker/#custom-apps So really all I'm arguing is that a few more lines of that PHP example be moved into the C implementation, so that the user only needs to provide that inner callable, not the outer while loop. Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
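The "scoped variables, not global state" requirement in the message above can be sketched as follows. This is a hypothetical illustration in the spirit of WSGI's environment dictionary, not the API of FrankenPHP or any other server; the function name handleRequest and the array keys are invented.

```php
<?php
// Hypothetical handler: everything it needs arrives as a parameter, and
// everything it produces leaves via the return value.
function handleRequest(array $environ): array
{
    $name = $environ['QUERY']['name'] ?? 'world';

    return [
        'status'  => 200,
        'headers' => ['Content-Type' => 'text/plain'],
        'body'    => "Hello, $name",
    ];
}

// Nothing here depends on which request is "current": neither call touches
// $_GET, $_SERVER or any other global.
$first  = handleRequest(['QUERY' => ['name' => 'A']]);
$second = handleRequest(['QUERY' => []]);
```

Whether the environment is an array or an object is secondary; the point is only that the handler's inputs and outputs are local to each invocation.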
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On 29/12/2023 21:14, Kévin Dunglas wrote:
> On Fri, Dec 29, 2023 at 8:14 PM Rowan Tommins wrote:
>> - FrankenPHP expects the user to manage the main event loop ...
>
> This isn't exact. FrankenPHP does manage the event loop (the Go runtime manages it - through a channel - under the hood).

Perhaps "event loop" was the wrong term; what I was highlighting is that to use FrankenPHP or RoadRunner, you have to write a while loop, which explicitly handles one request at a time. In Swoole, there is no such loop: you register event handlers and then call $server->start() once. Similarly, WSGI mandates that the server "invokes the application callable once for each request it receives from an HTTP client".

It's a distinction of pull/poll (the application must actively block until next request) vs push/subscribe (the application is passively invoked whenever needed).

> I already replied to Crell about that. It will totally possible to expose more complex HTTP message objects in the future, but PHP currently lacks such objects. The only things we have are superglobals (which are more or less similar to CGI variables, as done in WSGI) and streams. It's why we're using them.

The use of objects vs arrays wasn't the main difference I was trying to highlight there, but rather the overall API of how information gets into and out of the application. FrankenPHP is the only server listed which needs to reset global state on each request, because the others (including Python WSGI and ASGI) use non-global variables for both input and output. I notice that the Laravel Octane adaptor for FrankenPHP takes that global state and immediately converts it into non-global variables for consumption by the application.

> I'm not sure what you mean by "async PHP environment".

OpenSwoole, AMPHP, ReactPHP, etc - servers which expose concurrency directly to the user of PHP. In those environments, global state isn't just reused between consecutive requests, it's shared between multiple requests running concurrently, so a global "current request" and "current response" have no meaning.

> WebSockets and WebTransport are a different kind of beast, they are much lower level than HTTP and will require a different API anyway (and probably a lot of other adaptations in core) to be supported in PHP.

WebSocket support in PHP is just as real as worker modes and asynchronous concurrency. Swoole has a WebSocket implementation included in core [https://openswoole.com/docs/modules/swoole-websocket-server] and RoadRunner has a plugin for it [https://roadrunner.dev/docs/plugins-centrifuge/current]. In both cases (and in ASGI), the same basic API is used as with HTTP, but using a more general concept of "events" in place of "requests". Other PHP implementations include Ratchet [http://socketo.me/] and AMPHP Websocket Server [https://github.com/amphp/websocket-server].

Note that both async and WebSockets were mentioned as possible "forward compatibility". If we're talking about "next generation SAPIs", these are the kinds of features that people will be - and already are - developing; so it seems foolish not to at least consider them when designing new baseline APIs.

Regards, -- Rowan Tommins [IMSoP]
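The pull/poll versus push/subscribe distinction drawn in the message above can be sketched with a stand-in server. FakeServer and its methods are hypothetical and exist only for this illustration; they are not the FrankenPHP, RoadRunner or Swoole APIs.

```php
<?php
// Hypothetical stand-in for a server, holding a queue of pending requests.
final class FakeServer
{
    private array $queue;
    public array $responses = [];

    public function __construct(array $queue)
    {
        $this->queue = $queue;
    }

    // Pull/poll style: the application asks for the next request.
    public function nextRequest(): ?string
    {
        return array_shift($this->queue);
    }

    // Push/subscribe style: the server owns the loop and invokes the handler.
    public function serve(callable $handler): void
    {
        while (($request = array_shift($this->queue)) !== null) {
            $this->responses[] = $handler($request);
        }
    }
}

// Pull: the worker script itself contains the loop.
$pull = new FakeServer(['/a', '/b']);
while (($request = $pull->nextRequest()) !== null) {
    $pull->responses[] = "handled $request";
}

// Push: the worker script only registers a callback.
$push = new FakeServer(['/a', '/b']);
$push->serve(fn (string $request): string => "handled $request");
```

Both styles produce the same responses here; the difference is who owns the loop, which matters once the server wants to dispatch more than one request at a time.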
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On 23/12/2023 20:34, Kévin Dunglas wrote:
> In addition to sharing code, maintenance, performance optimization, etc., the existence of a common infrastructure would standardize the way worker scripts are created and provide a high-level PHP API for writing worker scripts that work with all SAPIs that rely on this new feature.

While this seems like a noble aim, there doesn't seem to be much consensus on what such an API should look like; from what I can see:

- FrankenPHP expects the user to manage the main event loop, repeatedly passing the server a function to be called once; it doesn't pass anything into or out of the userland handler, instead resetting global state to mimic a non-worker environment [https://frankenphp.dev/docs/worker/#custom-apps]
- RoadRunner doesn't use a callback at all, instead providing methods to await a request and provide a response; it directly uses PSR-7 and PSR-17 objects [https://roadrunner.dev/docs/php-worker/current/en]
- OpenSwoole manages the main loop itself, and uses lifecycle events to interface to userland code; the HTTP 'Request' event is passed custom Request and Response objects [https://openswoole.com/docs/modules/swoole-http-server-on-request]

It also seems relevant to mention the situation in Python:

- WSGI specifies a Python-level interface between a web server and a web application / framework. The server side is expected to provide the event loop (unlike in FrankenPHP), and passes the application an environment dictionary (based on CGI) and a start_response callback. [https://peps.python.org/pep-/]
- The newer ASGI generalises this interface into an asynchronous event handling system, including support for WebSockets. [https://asgi.readthedocs.io/en/latest/introduction.html]

Out of all of these, the FrankenPHP approach seems to be the most basic, providing good backwards compatibility with PHP's normal "shared nothing" approach, but not much forwards compatibility - I can't see how it would be adapted for an async PHP environment, or with WebSockets, for instance. I'm sceptical how many SAPIs would actually implement it, rather than providing more powerful APIs.

Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC][Discussion] NotSerializable attribute
On 9 December 2023 12:30:29 GMT, Max Semenik wrote: >Hi, I'd like to propose a new attribute, #[NotSerializable]. This >functionality is already available for internal classes - userspace should >benefit from it, too. If this ends up approximately the same as implementing serialisation as an exception, it feels quite a thin feature. If you put __sleep and __wakeup as shown into a trait, it's already as short and explicit as "use NotSerializable;" What would make it more compelling is if the engine itself could do more with the attribute. For instance, a direct isSerializable() on ReflectionClass that covered both internal and attribute-marked classes. It would also be useful to have some interface for classes that are *sometimes* serializable, because they contain open-ended collections of other objects. An example being exceptions, which may collect objects as part of the backtrace information. Such a class could iterate its contained objects, checking if they are unserializable classes, or classes which should recursively be asked if the instance is serializable. Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
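For reference, the trait-based alternative mentioned in the message above might look like the following. This is a sketch, assuming the __sleep/__wakeup approach described there; the trait name mirrors the proposed attribute, and the Connection class and the exception messages are invented for illustration.

```php
<?php
// Hypothetical userland trait: refuses serialization in both directions.
trait NotSerializable
{
    public function __sleep(): array
    {
        throw new Exception(static::class . ' cannot be serialized');
    }

    public function __wakeup(): void
    {
        throw new Exception(static::class . ' cannot be unserialized');
    }
}

final class Connection
{
    use NotSerializable;
}

try {
    serialize(new Connection());
    $error = null;
} catch (Exception $e) {
    $error = $e->getMessage();
}
```

What the trait cannot do is tell other code in advance that the class is unserializable, which is where engine support such as a reflection check would add something.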
Re: [PHP-DEV] Adding a donate link to the PHP website
Nicolas Grekas wrote: >There's one important piece missing in your analysis: > >https://docs.github.com/en/site-policy/github-terms/github-terms-of-service#6-contributions-under-repository-license Note the second paragraph there: > Isn't this just how it works already? Yep. This is widely accepted as the > norm in the open-source community; it's commonly referred to by the shorthand > "inbound=outbound". We're just making it explicit. My reading of Ben's analysis is that not only is this widely accepted by the community, it's widely accepted by the legal system as well. So contributing via GitHub isn't doing anything extra here, they're just reminding their users, like statements of "all rights reserved" or "this does not affect your statutory rights". As to the main thrust of the thread: I agree with Larry's last email: both donation and licensing changes seem sensible steps forward. The only note of caution I would throw in is that the recent Technical Committee proposal [1] was soundly rejected, so any *organisational* changes are likely to be a lot more contentious. So care should be taken to separate those from purely legal or financial links. [1] https://wiki.php.net/rfc/php_technical_committee Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Callable arguments cannot have default value
On 28/11/2023 09:54, Claude Pache wrote:
> The big problem with the `callable` type, is that it can be check only at runtime. For instance:
>
> ```php
> function foo(callable $x) { }
> foo('strlen'); // ok
> foo('i_dont_exist'); // throws a TypeError
> ```

To expand on this example, and address the original question more explicitly, consider if we allowed this:

    function foo(callable $x = 'maybe_exists') { }

To decide whether that's a valid definition, the compiler needs to know whether 'maybe_exists' can be resolved to the name of a global function; but it might be defined in a different file, which hasn't been included yet (or, more generally, which isn't being compiled right now).

To allow the default, the engine would need to defer the validity check until the function is actually executed. This is how "new in initializers" works [https://wiki.php.net/rfc/new_in_initializers] and we can actually use that feature to implement a default for callable parameters:

```php
class WrappedCallable {
    // Note: can't declare callable as the property type, but can as an explicit constructor parameter
    private $callable;

    public function __construct(callable $callable) {
        $this->callable = $callable;
    }

    public function __invoke(...$args) {
        return ($this->callable)(...$args);
    }
}

function test(callable $f = new WrappedCallable('strlen')) {
    echo $f('hello');
}

test();
```

Using this wrapper, we can pass in any value which is itself valid in an initializer, including callables specified as 'funcname' or ['class', 'staticmethod'].

The trick is that we're not actually evaluating that value as a callable until we invoke test(), at which point the constructor of WrappedCallable performs the assertion that it's actually callable. So this compiles:

    function test(callable $f = new WrappedCallable('i_dont_exist')) { echo $f('hello'); }

But will then error at run-time, *unless* a global function called i_dont_exist has been defined before that call.

It seems like it would be feasible for the engine to do something similar natively, creating an equivalent of WrappedCallable('i_dont_exist') using the first-class callable syntax:

    function test(callable $f = i_dont_exist(...)) { echo $f('hello'); }

Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC][Discussion] Why can constructors violate LSP?
On 23 November 2023 20:31:09 GMT, Robert Landers wrote: >I'd like to propose an RFC to enforce the covariance of constructors >(just like is done for other methods), to take effect in PHP 9, with a >deprecation notice in 8.3.x. There's a lot more than visibility that is enforced on normal methods, but isn't on constructors. For instance this is also valid: class A { public function __construct(int $foo) {} } class B extends A { public function __construct(string $bar) {} } From a theoretical perspective, I think the argument is roughly that classes aren't first-class citizens that you can pass around, so substitutability doesn't apply. You can't for instance write a function that explicitly depends on "a class definition inheriting from A", like this: function foo(class $class) { $instance = new $class(42); } You can certainly simulate such code with some strings and maybe a bit of reflection, but the language isn't going to make any guarantees about it. I did just think of a counter-example, though, which is that "new static($param)" is allowed, even though there's no way to know if $param will be accepted by subclasses. Maybe it shouldn't be allowed? From a practical point of view, it's often very useful to sub-class something and provide a constructor with a different signature. Maybe your subclass has additional dependencies; maybe it can hard-code or calculate some of the inputs to the parent constructor for a special case, etc. A private constructor can be used in conjunction with static methods to simulate multiple named constructors (createFromString, createFromRequest, etc). Given the lack of other guarantees, there's no particular gain in preventing that just because the parent class has a public constructor. Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
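The "new static($param)" counter-example from the message above can be made concrete. This is a sketch extending the A and B classes from that message; the create() method and the second constructor parameter on B are added here purely for illustration.

```php
<?php
class A
{
    public function __construct(int $foo) {}

    public static function create(int $foo): static
    {
        // Compiles fine, but silently assumes every subclass keeps a
        // constructor that accepts a single int.
        return new static($foo);
    }
}

class B extends A
{
    // Allowed: constructors are exempt from signature compatibility checks.
    public function __construct(string $bar, string $extra) {}
}

$a = A::create(42); // fine: calls A's own constructor

try {
    B::create(42);  // new static() resolves to B, whose constructor needs two arguments
    $failure = null;
} catch (ArgumentCountError $e) {
    $failure = get_class($e);
}
```

The failure only appears at run-time, and only for the subclass, which is exactly the kind of guarantee the language does not currently make about constructors.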
Re: [PHP-DEV] RFC Proposal - static modifier for classes
On Thu, 23 Nov 2023 at 14:20, Stephen Reay wrote:
> Out of the box, with no package manager or module loader or anything, you can do as little as call `spl_autoload_register` with no arguments, and have working class autoloading.

Sure, it's just about possible that you could paste that one line into each of your PHP files, and they'd all pick up the right classes. But it's more likely that you have some shared include called "startup.php", "bootstrap.php", "header.php", etc, which has that line plus a bunch of other setup.

So, the argument goes that adding a line "require 'namespace_foo_functions.php';" to that shared include isn't that big a problem.

(If it happens you are using Composer, the files and config are listed in a JSON file rather than a PHP one, and Composer generates the shared include, but the logic's basically the same.)

The strongest counter-argument, I think, is that it *scales* badly: once you have more than a handful of such include files, an autoloader solution becomes more attractive than listing all the files.

However, in itself that's not an argument for static classes; it's an argument for function autoloading; the argument for static classes needs to be either:

- "admit it, we're never going to get function autoloading";
- or "I'd want this even if we had function autoloading" (in which case the discussion of require vs autoloading becomes irrelevant).

Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] RFC Proposal - static modifier for classes
On Thu, 23 Nov 2023 at 11:48, Stephen Reay wrote: > > Respectively, I disagree that it's "not a big problem" if your goal is to > encourage people to use regular functions over classes with static methods. > Just to be clear, my answer was specifically addressing your point about using Composer as an argument for not including things. I was not saying "... and therefore the argument is true", only "... and therefore we can discuss the argument without mentioning Composer if we want to". > PHP ships with a built in class autoloader function, and pretending that using 'require_once' everywhere a function is used, is just as easy for the developer seem disingenuous to be honest. PHP ships with *the ability to configure* an autoloading function; it will not load any files without you first telling it where to look. The workaround being proposed is not to use require_once every time you want a function, it's to use require_once in the same place you configure your autoloader. I totally agree that we can debate whether that workaround is sufficient. I'm just trying to frame that debate as "autoloading vs require", rather than a distraction of "Composer vs something else". Regards, -- Rowan Tommins [IMSoP]
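As a rough sketch of the workaround described in the message above, a shared include might configure class autoloading and list function files side by side. The file layout (a src/ directory, a Foo/functions.php file) is an assumption made for this example, and the is_file() guards are only there so the sketch runs without those files existing.

```php
<?php
// bootstrap.php (hypothetical shared include)

// Class autoloading: PHP only consults this because we registered it.
spl_autoload_register(function (string $class): void {
    $file = __DIR__ . '/src/' . str_replace('\\', '/', $class) . '.php';
    if (is_file($file)) {
        require $file;
    }
});

// Functions can't be autoloaded, so their files are listed explicitly,
// once, next to the autoloader configuration.
$functionFiles = [
    __DIR__ . '/src/Foo/functions.php',
];
foreach ($functionFiles as $file) {
    if (is_file($file)) {
        require_once $file;
    }
}
```

The list of function files is the part that grows with the codebase, which is the scaling concern raised elsewhere in this thread.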
Re: [PHP-DEV] Re: [RFC][Discussion] Harmonise "untyped" and "typed" properties
On Thu, 23 Nov 2023 at 08:48, Nicolas Grekas wrote:
> Sorry this comes as a surprise to you but you're rewriting history here. The current behavior, the one that was fixed in that commit, matches how PHP behaved before typed properties, so this commit brought consistency.

The question of "what does __get do with a property that has been declared but not assigned a value?" has no answer before PHP 7.4, because that situation simply never happened. So there isn't really one answer to what is "consistent".

If I understand rightly, your position, and Nikita's, is that the behaviour should be consistent with the statement "if you have declared a property, access doesn't trigger __get unless you explicitly call unset()". This is what the change that slipped into 7.4.1 "fixed".

The reason it surprised me is that I expected it to be consistent with a different statement: "if you have an __get method, this takes precedence over 'undefined variable' notices/warnings/errors". That statement was true in 7.3, and still true in the original implementation in 7.4.0 (including "uninitialized" alongside "undefined"), but was *broken* by the change in 7.4.1: now, the "uninitialized property" error takes precedence over __get in the specific case of never having assigned a value.

> About the behavior, it's been in use for many years to build lazy proxies. I know two major use cases that leverage this powerful capability: Doctrine entities and Symfony lazy services. There are more as any code that leverages ocramius/proxy-manager relies on this.

Just to be clear, it is not the behaviour *after* calling unset which I am concerned about, it is the behaviour *before* calling unset or assigning any value. I was aware of the interaction with __get, but wrongly assumed that the rule was simply "uninitialized properties trigger __get".

> About the vocabulary, the source tells us that "uninitialized" properties that are unset() become "undefined". I know that's not super accurate since a typed property is always defined semantically

Just "undefined" is not sufficiently unambiguous; you have to distinguish four different states:

1) Never declared, or added dynamically and then unset
2) Declared without a type, then unset
3) Declared with a type, not yet assigned any value
4) Declared with a type, then unset

The messages presented to the user refer to both (1) and (2) as "undefined", and both (3) and (4) as "uninitialized". As it stands, the RFC would replace all instances of (2) with (4), but that still leaves us with two names for three states.

Claude Pache wrote:
> However, it is not a problem in practice, because users of classes implementing (1) but not (2) do not unset declared properties, ever.

Nikita Popov wrote:
> ... and then forbid calling unset() on declared properties

Right now, unset() is also the only way to break a reference, other than another assign-by-reference, so it's quite reasonable to write "unset($this->foo)" instead of "$this->foo" to ensure a property is truly reset to a known state.

Maybe we need a new function or operator to atomically break the reference and assign a new value, e.g. unreference($this->foo, 42); or $this->foo := 42; to replace unset($this->foo); $this->foo=42;

Regards, -- Rowan Tommins [IMSoP]
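The use of unset() to break a reference, mentioned in the message above, can be seen in a small example. This is a sketch; the Counter class and its method names are invented for illustration.

```php
<?php
class Counter
{
    public int $value = 0;

    public function resetKeepingReference(): void
    {
        // If $value is bound to a reference, this writes through it.
        $this->value = 42;
    }

    public function resetBreakingReference(): void
    {
        // unset() detaches the property from any reference set first...
        unset($this->value);
        // ...so this assignment affects only the property itself.
        $this->value = 42;
    }
}

$first = new Counter();
$alias = &$first->value;
$first->resetKeepingReference();
// $alias has been changed along with the property.

$second = new Counter();
$other = &$second->value;
$second->resetBreakingReference();
// $other keeps its old value; only $second->value changed.
```

Between the unset() and the assignment, the property is briefly in state (4) from the list above, which is why a single atomic operation might be preferable.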
Re: [PHP-DEV] RFC Proposal - static modifier for classes
On Thu, 23 Nov 2023 at 06:00, Stephen Reay wrote: > I'm disappointed to see yet again that there's this implied notion that > working with PHP in 2023 means "well surely you must be using composer", > which leads to "but composer..." somehow being an accepted argument when > it comes to missing/incomplete builtin functionality. > While I appreciate your point in the general case, in this particular thread, the mentions of Composer are really just examples, or can be reworded that way: Functions lack autoloading, but in practice this isn't a big problem because you can just require_once a file defining them, and as long as OpCache is running there's very little performance penalty. If you're using a package manager or module loading system to integrate multiple autoloaders, it's generally easy to add one or more required files as part of the package / module config - *for example* Composer has a "files" array in each package's "autoload" config. So the actual assumption is "surely you must be using OpCache", which unlike Composer is bundled with PHP. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] Re: [RFC][Discussion] Harmonise "untyped" and "typed" properties
On 23 November 2023 01:37:06 GMT, Claude Pache wrote: >What you describe in the last sentence is what was initially designed and >implemented by the RFC: https://wiki.php.net/rfc/typed_properties_v2 (section >Overloaded Properties). > >However, it was later changed to the current semantics (unset() needed in >order to trigger __get()) in https://github.com/php/php-src/pull/4974 Good find. So not only is it not specified this way in the RFC, it actually made it into a live release, then someone complained and we rushed out a more complicated version "to avoid WTF". That's really unfortunate. I'm not at all convinced by the argument in the linked bug report - whether you get an error or an unexpected call to __get, the solution is to assign a valid value to the property. And making the behaviour different after unset() just hides the user's problem, which is that they didn't expect to *ever* have a call to __get for that property. But I guess I'm 4 years too late to make that case. Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Re: [RFC][Discussion] Harmonise "untyped" and "typed" properties
On 22 November 2023 14:12:09 GMT, Nicolas Grekas wrote: >I think there is an inaccuracy that needs to be fixed in the after-unset >state : as noted later in the RFC, magic accessors are called after an >unset($this->typedProps). This means the state cannot be described as >identical ("uninitialized') before and after unset() in the first table in >the RFC. Isn't there some vocabulary in the source that we can use to >describe those states more accurately? Oh. Wow. That's more than just inaccurate terminology... I always assumed the rule was "access to uninitialised properties triggers __get", not that there was yet another magical state buried in the implementation. From a user point of view, I find that frankly terrible: > Typed properties start off as uninitialized, but if you use unset(), you can > make them *super-uninitialized*. > > There's no way to actually see if something's uninitialized or > super-uninitialized; and once you've assigned a value, you can't go back to > the original uninitialized, only to super-uninitialized. > > Accessing an uninitialized property always throws an error, whereas accessing > a super-uninitialized property will first check for __get. I'm not sure choosing a different name from "super-uninitialized" makes much difference to how that reads. I'm probably going to regret asking this, but is there some reason it works that way? Is there any chance of changing it to just: > Typed properties start off as uninitialized. > > Once you've assigned a value, you can't go back to the original uninitialized > state using unset() > > Accessing an uninitialized property will first check for __get, and throw an > error if that isn't defined. Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
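The two states discussed in the message above can be observed directly. This is a sketch of the behaviour as described in this thread (PHP 7.4.1 and later); the Lazy class is invented for illustration, and __get returns an int so that the value is compatible with the declared property type.

```php
<?php
class Lazy
{
    public int $typed;

    public function __get(string $name)
    {
        return 42;
    }
}

// Never assigned: __get is not consulted, an Error is thrown instead.
$fresh = new Lazy();
try {
    $value = $fresh->typed;
    $freshResult = 'no error';
} catch (Error $e) {
    $freshResult = 'Error';
}

// After an explicit unset(): the same access now goes through __get.
$afterUnset = new Lazy();
unset($afterUnset->typed);
$unsetResult = $afterUnset->typed;
```

From the outside there is no way to tell which of the two states an object's property is in, other than by accessing it and seeing what happens.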
Re: [PHP-DEV] Debian Upgrade
On 21/11/2023 12:14, Derick Rethans wrote: I have fixed this now. It turned out to be a bug in *bugs.php.net*, which I had fixed yesterday... Wow, that was an unexpected chain of dependencies! Thanks for tracking it down. :) Regards, -- Rowan Tommins [IMSoP]
[PHP-DEV] Re: [RFC][Discussion] Harmonise "untyped" and "typed" properties
On 16/11/2023 20:41, Rowan Tommins wrote: Hi all, I have finally written up an RFC I have been considering for some time: Harmonise "untyped" and "typed" properties RFC URL: https://wiki.php.net/rfc/mixed_vs_untyped_properties I've revised the RFC; it now proposes to keep the implicit "= null" for untyped properties, although I'm still interested in suggestions for other strategies around that. I have also added discussion of variance checks (thanks Claude for the tips on that). While doing so, I checked Reflection, and am unsure how to proceed. Currently ReflectionParameter shows a difference between "function foo($bar)" and "function foo(mixed $bar)", even though these are analysed as equivalent in inheritance checks. Should ReflectionProperty also retain this distinction? Was the possibility discussed when "mixed" was introduced of using a ReflectionType of mixed for both cases? Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
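The Reflection distinction raised in the message above can be checked with a short script. This is a sketch showing the behaviour on PHP 8.0 and later, before any change the RFC might make; the function and class names are invented for illustration.

```php
<?php
function untyped($bar) {}
function explicitMixed(mixed $bar) {}

// Parameters: an untyped parameter has no type object at all, while an
// explicit "mixed" is reported as a named type.
$untypedType = (new ReflectionFunction('untyped'))->getParameters()[0]->getType();
$mixedType   = (new ReflectionFunction('explicitMixed'))->getParameters()[0]->getType();

class Example
{
    public $untyped;
    public mixed $explicit;
}

// Properties currently show the same distinction.
$propUntyped = (new ReflectionProperty(Example::class, 'untyped'))->getType();
$propMixed   = (new ReflectionProperty(Example::class, 'explicit'))->getType();
```

Any code that treats a null type as "no declaration" would see a difference if both cases started reporting a ReflectionType of mixed.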
Re: [PHP-DEV] Debian Upgrade
On 15/11/2023 16:59, Derick Rethans wrote: In the last few days, I have upgrade all our Digital Ocean droplets from Debian 10 (or 9!) to 12. That also means they now run PHP 8.2. Hi Derick, Thanks for getting things current! System maintenance is such an essential but often underappreciated task. I notice news.php.net (and therefore externals.io, which feeds from it) hasn't copied any message from the list since the day after you posted this (last timestamp is 16 Nov 2023 22:14:19 -). Could that be related somehow? Or if it's just coincidence, maybe you or someone else here knows which service might need prodding to bring it back to life? Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] RFC Proposal - static modifier for classes
On 20 November 2023 18:53:50 GMT, Deleu wrote: >> It's probably not productive to just say "the people who voted last time >> are wrong", but it was long enough ago that a new RFC on the topic wouldn't >> break any rules. > > >9 years is long enough to conclude that whatever happens, they weren't >really wrong back then. Absolutely; to make sure the point is clear, I meant that just saying "they were wrong", or even "they would be wrong if they said that now" is not a *sufficient* argument. You either need to explain *why* the situation has changed, or *why* the arguments were not sufficiently considered in the first place. Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] RFC Proposal - static modifier for classes
On 20 November 2023 08:35:15 GMT, Lanre Waju wrote: >1. I will personally implement this feature. That's good to hear, but the initial implementation is not the main cost of a new feature. Once we add something, it's very hard to remove, and every future change has to consider that feature and make sure it doesn't break. That's why "do we need X when Y does nearly the same thing" is a more valid argument (in general) than you're giving it credit for: if we included every variation of every feature, the language and its implementation would become unmanageably complex, so we have to choose where to draw the line. It's up to the person proposing a feature to persuade others that it falls the right side of that line - that the benefit of the feature outweighs the cost of having it in the language. (It's also worth noting that the previous proposal also had an implementation, linked at the bottom of the RFC.) It's probably not productive to just say "the people who voted last time are wrong", but it was long enough ago that a new RFC on the topic wouldn't break any rules. So, if you want to proceed with this, you can try to come up with a justification that addresses the points raised previously, and follow the process here: https://wiki.php.net/rfc/howto Regards, -- Rowan Tommins [IMSoP] -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC Proposal - static modifier for classes
On 19 November 2023 21:28:08 GMT, Lanre Waju wrote: >Hi, similar to the abstract and readonly modifiers to classes (similar in >syntax only), I propose a class level "static" modifier that ensures: Hi Lanre, There was a proposal for this several years ago which was declined at the voting stage: https://wiki.php.net/rfc/abstract_final_class That doesn't mean we can't look again, but any new proposal would need to at least address the reasons the previous one was declined. I believe these are the relevant discussion threads: https://externals.io/message/79211 https://externals.io/message/79338 https://externals.io/message/79601 My memory is that one of the main points against was that a class with only static methods is just a namespace, and you can already put functions directly in a namespace. The only issue being that we don't have good autoloading support for such functions, and that's a whole different problem... Regards, -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] [RFC][Discussion] Harmonise "untyped" and "typed" properties
On 17 November 2023 13:30:42 GMT, Claude Pache wrote: > >Yes, except that an untyped (respectively `mixed`) property cannot be >redeclared as `mixed` (resp. untyped) in a subclass. A small step in the right >direction is to allow that. Huh, I didn't know that. I'll add it to the RFC, at least to consider. The RFC to add "mixed" gives an example of removing the type as invariance, but doesn't seem to justify why "untyped" and "mixed" should be considered different, from a type system point of view. https://wiki.php.net/rfc/mixed_type_v2 Regards, -- Rowan Tommins [IMSoP]