from:"Rowan Tommins \[IMSoP\]"

Re: [PHP-DEV] Fwd: Request for RFC Karma to Propose any_empty and all_empty Methods

2024-06-04 Thread Rowan Tommins [IMSoP]


On 27/05/2024 17:56, Bilge wrote:

On 27/05/2024 17:51, Elminson De Oleo Baez wrote:

Below is a brief overview of the proposed methods:

any_empty(array $array): bool - This method will return true if any 
element in the provided array is empty, and false otherwise.
all_empty(array $array): bool - This method will return true if all 
elements in the provided array are empty, and false otherwise.

Dude... what? https://wiki.php.net/rfc/array_find



Please try to be polite, and remember that just because something seems 
obvious to you, that doesn't mean everyone else in the world already 
knows it.


How about we re-word with that in mind:


Hi Elminson,

As it happens, there has been a recent proposal which includes 
"array_any" and "array_all" functions, which could be combined with the 
existing "empty" function to achieve these results.


You can find the RFC here: https://wiki.php.net/rfc/array_find and the 
mailing list discussion here: https://externals.io/message/123015
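
As a rough illustration of what that combination would do, here is a userland sketch. The helper names any_empty() and all_empty() are taken from the original proposal and are hypothetical; with the functions proposed in the RFC, each body would presumably reduce to a single call that passes a callback wrapping empty().

```php
<?php
// Hypothetical userland helpers, named after the proposal in this thread.
// empty() is a language construct, so it has to be wrapped to be reusable.

function any_empty(array $array): bool
{
    foreach ($array as $value) {
        if (empty($value)) {
            return true; // stop at the first empty element
        }
    }
    return false;
}

function all_empty(array $array): bool
{
    foreach ($array as $value) {
        if (!empty($value)) {
            return false; // stop at the first non-empty element
        }
    }
    return true; // note: also true for an empty array
}
```

Note that empty() treats '', 0, '0', null, false and [] as empty, which may or may not be what a caller expects.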


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [Discussion] Implicitly backed enums

2024-05-23 Thread Rowan Tommins [IMSoP]


On 22/05/2024 00:31, Larry Garfield wrote:

I could see an argument for auto-populating the backing value off the enum name 
if it's not specified, something like this:

enum Options: string {
   case First; // This implicitly gets "First"
   case Second = '2nd';
}



This reminds me of the short-hand key-value syntax that JavaScript 
allows, which people have occasionally requested an equivalent of in PHP, 
where { foo } is equivalent to { 'foo': foo }


The downside I see to all such short-hands is that they make it much 
harder to refactor safely, because the identifier and the string value 
are tied together.


For instance, maybe you want to rename Options::First to Options::Legacy 
and Options::Second to Options::Modern so you edit the enum, and find 
all references in code:


enum Options: string {
  case Legacy;
  case Modern = '2nd';
}

But now everywhere you've serialized the old value of "First" is going 
to break, because the first case now has the implicit backing value of 
"Legacy" instead!


To avoid this, you have to go ahead and specify all the backing values:

enum Options: string {
  case Legacy = 'First';
  case Modern = '2nd';
}

Having to specify both the name and value in the first place makes that 
decision much more obvious, for what seems to me to be very little 
up-front cost.
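
A minimal sketch of why the explicit values matter, using the renamed enum from above and the standard tryFrom() method on backed enums. The stored string is an assumed example of data serialized before the rename.

```php
<?php
enum Options: string
{
    case Legacy = 'First'; // renamed case, original backing value kept
    case Modern = '2nd';
}

// A value that was serialized before the rename still resolves...
$fromOldData = Options::tryFrom('First');  // Options::Legacy

// ...whereas the new case name is not a valid backing value.
$fromNewName = Options::tryFrom('Legacy'); // null
```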


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [Discussion] "Internal" attribute and warning

2024-05-18 Thread Rowan Tommins [IMSoP]

On 18 May 2024 16:25:06 BST, Robert Landers  wrote:
>I thought about that too, but in-general, a vendor has the knowledge
>and capability to ensure any two packages work together (like Doctrine
>plugins in your example).

How do they achieve that "knowledge and capability" other than documentation, 
and tooling making use of that documentation?

Doctrine DBAL and Doctrine ORM are both large open-source codebases, which 
happen to have a dependency relationship, and also happen to have the same 
vendor namespace. Documentation and warnings about using internal 
functions/classes of the DBAL would be just as useful to a developer of the ORM 
as they would be to an application developer.

As another example, within the completely private codebase I work on 
professionally, we have shared modules, parts of which are intended to be 
implementation details and not subject to compatibility guarantees. It would be 
really useful to get an automatic notification if those were used in other 
parts of our codebase, but all of our code shares the same vendor namespace, so 
a single-level #[Internal] attribute would be entirely useless. 

Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [DISCUSSION] Checking uninitialized class properties

2024-05-18 Thread Rowan Tommins [IMSoP]

On 18 May 2024 17:13:49 BST, Larry Garfield  wrote:
>However, that breaks down with readonly properties, which are not allowed to 
>have a sentinel.  Uninitialized is their sentinel, for better or worse.

Sorry, I don't understand this statement at all. A readonly property can be set 
to PatchState::KeepCurrentValue just like any other. If the intention is that 
that state will be overwritten with an actual value later, then it's not a 
readonly property. 
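
To show what I mean, here is a small sketch; the class name and the single-case enum are assumptions based on the examples elsewhere in this thread, not anything from your code.

```php
<?php
enum PatchState
{
    case KeepCurrentValue;
}

final class MyDTOPatch
{
    public function __construct(
        // A readonly property can hold the sentinel like any other value.
        public readonly int|null|PatchState $propA = PatchState::KeepCurrentValue,
    ) {}
}

$patch = new MyDTOPatch();
$isSentinel = $patch->propA === PatchState::KeepCurrentValue;

// Replacing the sentinel later is exactly what readonly forbids.
try {
    $patch->propA = 42;
    $overwritten = true;
} catch (Error $e) {
    $overwritten = false;
}
```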

I guess you have some different scenario in mind?

> And as I noted earlier in the thread, when writing a serializer or other 
> dynamic systems (an ORM probably would have the same issue), you really need 
> to be able to differentiate between null and uninitialized.  Even if you 
> think the uninitialized value is a sign of an error, it's coming from code 
> you don't control so you have to be able to handle it somehow.

If a property is uninitialized, the object is in an invalid state, and 
attempting to read that property gives an error. That's by design, and as it 
should be.

Are you saying that you want to be able to detect the error before it happens? 
Why?

Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [Discussion] "Internal" attribute and warning

2024-05-18 Thread Rowan Tommins [IMSoP]


On 18/05/2024 15:00, Robert Landers wrote:

I've been thinking about having an "internal" attribute that will emit
a warning if called from outside its left-most namespace.



I like the general idea, but I don't think limiting to "left-most 
namespace" is the best semantics.


It's very common for the top-level namespace to represent a vendor, and 
the *second* level to be the specific package, e.g. Doctrine\DBAL vs 
Doctrine\ORM. You've even used that in your example - I presume you've 
made a typo, and meant both examples to be calling PackageA not 
PackageB. In other cases, there are more levels - e.g. Composer package 
"doctrine/mongodb-odm" has root namespace "Doctrine\ODM\MongoDB".


Possibly the attribute would need some argument to specify its 
granularity, e.g.  #[Internal('\MyCompany\PackageA')], 
#[Internal('\Doctrine\ODM\MongoDB')], but that would be annoying to 
write each time.


This is another case where PHP suffers from its lack of a separate 
concept of "package" or "module" to scope things to.



My second concern is how to implement this efficiently. The check can't 
happen at compile-time, because we don't know the definition of 
SomeOtherNamespace\Foo; so the check would need to be at run-time when 
the method/function is called. But at run-time, namespaces have very 
little existence - they really are just part of the names of functions, 
classes, and constants.


So when calling a marked function, we would have to look up the name of 
the calling function or the class name of the calling method, and then 
do a string comparison against the namespace constraint. Maybe that 
would be easy and fast, I don't know.
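
For illustration only, the string comparison itself might look something like this in userland. The function name is hypothetical, and this says nothing about how the engine would obtain the caller's name or how expensive that lookup would be.

```php
<?php
// Hypothetical helper: is $callerName inside $allowedNamespace?
function isWithinNamespace(string $callerName, string $allowedNamespace): bool
{
    // Normalise both sides, and require a trailing separator so that
    // "Doctrine\DBAL" does not accidentally match "Doctrine\DBALExtra".
    $prefix = trim($allowedNamespace, '\\') . '\\';

    return str_starts_with(ltrim($callerName, '\\'), $prefix);
}
```

The trailing separator is the only subtle part; a bare prefix match would treat sibling packages with similar names as internal.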


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [DISCUSSION] Checking uninitialized class properties

2024-05-18 Thread Rowan Tommins [IMSoP]


On 18/05/2024 11:52, Luigi Cardamone wrote:

I am already using a solution like the one
proposed by Robert: it works but it
leaves some doubts since each project
can have a different name for NotSet



An argument is often made that this is a good thing, in the sense that 
"null" and other "universal" sentinel values combine multiple meanings 
in an unhelpful way. You'll see this frequently regarding SQL's handling 
of NULL, for instance - does it mean "unknown", "not applicable", 
"invalid", etc.


A common solution put forward to this perceived problem is "algebraic 
data types" - a Maybe or Option type for "value or missing", an Error or 
Failable type for "value or error", etc. PHP doesn't have those 
(yet...), but the same information can be conveyed nicely with a final 
class or single-element enum.


In your example, the actual value you want to represent is not "Not 
Set", it's "Keep Current Value", so that's the sentinel value you need:


final class KeepCurrent {}
class MyDTOPatch {
    public int|null|KeepCurrent $propA;
    public int|null|KeepCurrent $propB;
}

or:

enum PatchState { case KeepCurrent; }
class MyDTOPatch {
    public int|null|PatchState $propA;
    public int|null|PatchState $propB;
}
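
As a sketch of how the enum version might be consumed: the constructor defaults, the applyPatch() function and the array representation of the current state are all assumptions added for illustration.

```php
<?php
enum PatchState
{
    case KeepCurrent;
}

final class MyDTOPatch
{
    public function __construct(
        public int|null|PatchState $propA = PatchState::KeepCurrent,
        public int|null|PatchState $propB = PatchState::KeepCurrent,
    ) {}
}

// Hypothetical consumer: copy over only the fields the patch sets.
function applyPatch(array $current, MyDTOPatch $patch): array
{
    foreach (['propA', 'propB'] as $name) {
        if ($patch->$name !== PatchState::KeepCurrent) {
            $current[$name] = $patch->$name; // null is a real value here
        }
    }
    return $current;
}

$result = applyPatch(['propA' => 1, 'propB' => 2], new MyDTOPatch(propA: null));
```

Here propA is explicitly set to null while propB is left alone, with no property ever being uninitialized.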




Are there any downsides in adding a
specific syntax to check if a property
is initialized with any value?



In my opinion - and I stress that others may not share this opinion - 
the entire concept of "uninitialized properties" is a wart on the 
language, which we should be doing our best to eliminate, not adding 
more features around it.


As a bit of background, the concept was created when typed properties 
were being added, to handle a limitation of the language: given the 
declaration "public Foo $prop;" there is no way to specify an initial 
value which meets the type constraint. For nullable properties, you can 
write "public ?Foo $prop=null;" but since PHP (thankfully) distinguishes 
nullable and non-nullable types, you can't write "public Foo $prop=null;"


Some languages, e.g. Swift, require that all properties are initialised 
before the constructor returns, but retrofitting this to PHP was 
considered impractical, so instead it was left to a run-time error: if 
you fail to initialise a property, you will get an error trying to 
access it.


To track that, the engine has to record a special state, but assigning a 
meaning to that error state is a bit like using exceptions for flow 
control. It would be more in keeping with the original purpose to have 
an object_is_valid() function, which returned false if *any* property 
had not been initialised to a valid value.
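
No such function exists, but a userland approximation can be sketched with the existing ReflectionProperty::isInitialized() method. The function name is the hypothetical one from the paragraph above, and the sketch assumes PHP 8.1 or later, where reflection can read non-public properties without extra calls.

```php
<?php
// Hypothetical: returns false if *any* property is still uninitialized.
function object_is_valid(object $object): bool
{
    foreach ((new ReflectionObject($object))->getProperties() as $property) {
        if (!$property->isInitialized($object)) {
            return false;
        }
    }
    return true;
}

class Point
{
    public int $x;
    public int $y;
}

$point = new Point();
$before = object_is_valid($point); // nothing assigned yet

$point->x = 1;
$point->y = 2;
$after = object_is_valid($point);  // every property now has a value
```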



PHP actually has a bewildering variety of such special states. In a 
different compromise added at the same time, calling unset() on a typed 
property puts it into a *separate* state where magic __get and __set are 
called, which they are not if the property has simply not yet been 
assigned, e.g. https://3v4l.org/C7rIF
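
A small sketch of the two states side by side; the class and the value returned from __get are made up for the example.

```php
<?php
class Magic
{
    public int $prop;

    public function __get(string $name): mixed
    {
        return "magic:$name";
    }
}

$object = new Magic();

// State 1: never assigned. Reading throws, and __get is not consulted.
try {
    $object->prop;
    $firstRead = 'no error';
} catch (Error $e) {
    $firstRead = 'error';
}

// State 2: explicitly unset. The same read now goes through __get.
unset($object->prop);
$secondRead = $object->prop;
```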


I have always found this a mess. If a property says it is of type 
"?int", I want to know that it will always be an integer or null, not 
"int or null or uninitialised or unset". If it needs more than one 
non-integer state, that should be specified in the type system, e.g. 
"int|NotApplicable|NotSpecified|NotLoaded".



PS: Etiquette on this list is to post replies below the text you're 
replying to, preferably editing to the relevant parts as I've done here, 
rather than adding your text above the quoted message.


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] Transform exit() from a language construct into a standard function

2024-05-11 Thread Rowan Tommins [IMSoP]

On 11 May 2024 15:43:19 BST, "Gina P. Banyard"  wrote:
>print, echo, include(_once) and require(_once) do not mandate their "argument" 
>to be passed within parentheses, so making them functions does not simplify 
>the lexer/parser nor remove them as keywords.

It's actually a much stronger difference than that: parentheses are not parsed 
as surrounding the argument lists for those keywords at all.

A while ago, I added notes to the manual pages of each showing how this can 
lead to misleading code, e.g. one of the examples on https://www.php.net/print 
is this: 

print(1 + 2) * 3;
// outputs "9"; the parentheses cause 1+2 to be evaluated first, then 3*3
// the print statement sees the whole expression as one argument

echo has further peculiarities, because it takes an unparenthesised list of 
arguments, and can't be used in an expression.
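
A short sketch of those differences, using output buffering only so that the printed text can be inspected afterwards:

```php
<?php
ob_start();
print(1 + 2) * 3;          // parsed as: print ((1 + 2) * 3);
$printed = ob_get_clean();

ob_start();
$result = print 'x';       // print is an expression and always returns 1
echo 'a', 'b';             // echo takes a comma-separated list, no parentheses
$echoed = ob_get_clean();

// By contrast, "$y = echo 'a';" is a parse error: echo is a statement only.
```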

While it would probably have been better if those had been parsed like 
functions to begin with, changing them now would not just be pointless, it 
would be actively dangerous, changing the behaviour of existing code. 

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type

2024-05-05 Thread Rowan Tommins [IMSoP]

On 30 April 2024 11:16:20 GMT-07:00, Arvids Godjuks wrote:
>I think setting some expectations in the proper context is warranted here.
>
>1. Would a native decimal type be good for the language? I would say we
>probably are not going to find many if any people who would be against it.

As I said earlier, I don't think that's the right question, because "adding a 
native type" isn't a defined process. Better questions are: Should a decimal 
type be always available? Does a decimal type need special features to maximise 
performance? Should we have special syntax for a decimal type? What functions 
should support a decimal type, or have versions which do?

>2. Is there a need for it? Well, the whole world of e-commerce, accounting
>and all kinds of business systems that deal with money in PHP world do not
>leave any room for doubt - https://packagist.org/?query=money . The use
>case is right there :)

That's a great example - would a decimal type make those libraries redundant? 
Probably not - they provide currency and rounding facilities beyond basic 
maths. Would those libraries benefit from an always-available, high-performance 
native type? Certainly. 

Would they benefit from it having strong integration into the syntax and 
standard library of the language? Not really; there's a small amount of actual 
code dealing with the values.

>4. Is it a lot of engine work?

Only if we go for the maximum ambition, highly integrated into the language. 

> Is it worth it? 

I'm actually not convinced.

>5. But BCMath/GMP/etc!!! Well, extensions are optional.

Extensions are only optional if we decide they are. ext/json used to be 
optional, but now it's always-on.

> They are also not as fast and they deal with strings.

Not as fast as what? If someone wants to make an extension around a faster 
library, they can. And only BCMath acts directly on strings; other libraries 
use text input to create a value in memory - whether that's a PHP string or a 
literal provided by the compiler doesn't make much difference.

I absolutely think there are use cases for decimal types and functions; but "I 
want a faster implementation" and "I want to add a new fundamental type to the 
language, affecting every corner of the engine" are very different things.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type

2024-04-30 Thread Rowan Tommins [IMSoP]

On 28 April 2024 07:47:40 GMT-07:00, Robert Landers wrote:

>I'm not so sure this could be implemented as an extension, there just
>isn't the right hooks for it.

The whole point of my email was that "this" is not one single feature, but a 
whole series of them. Some of them can be implemented as an extension right 
now; some could be implemented as an extension by adding more hooks which would 
also be useful for other extensions; some would need changes to the core of the 
language.

If the aim is "everything you could possibly want in a decimal type", it 
certainly can't be an extension; if the aim is "better support for decimals", 
then it possibly can.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type

2024-04-28 Thread Rowan Tommins [IMSoP]

On 28 April 2024 07:02:22 BST, Alexander Pravdin  wrote:
>Hello everyone. To continue the discussion, I'm suggesting an updated
>version of my proposal.

This all sounds very useful ... but it also sounds like several months of 
full-time expert development.

Before you begin, I think it will be really important to define clearly what 
use cases you are trying to cater for, and who your audience is. Only then can 
you define a minimum set of requirements and goals.

It seems to me that the starting point would be an extension with a decimal 
type as an object, and implementations for all the operations you want to 
support. You'll probably want to define that more clearly than "anything in the 
language which takes a float".

What might seem like it would be the next step is converting the object to a 
"native type", by adding a new case to the zval struct. Not only would this 
require a large amount of work to start with, it would have an ongoing impact 
on everyone working with the internals.

I think a lot of the benefits could actually be delivered without it, and as 
separate projects: 

- Optimising the memory performance of the type, using copy-on-write semantics 
rather than eager cloning. See Gina's recent thread about "data classes".

- Overloading existing functions which accept floats with decimal 
implementations. Could potentially be done in a similar way to operator 
overloads and special interfaces like Countable.

- Convenient syntax for creating decimal values, such as 0.2d, 
declare(default_decimal), or having (decimal) casts affecting the tree of 
operations below them rather than just the result. This just needs the type to 
be available to the compiler, not a new zval type - for instance, anonymous 
function syntax creates a Closure object. 

There may be other parts I've not mentioned, but hopefully this illustrates the 
idea that "a native decimal type" doesn't have to be one all-or-nothing project.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC][Vote] Property Hooks

2024-04-27 Thread Rowan Tommins [IMSoP]


On 15/04/2024 17:43, Larry Garfield wrote:

The vote for the Property Hooks RFC is now open:

https://wiki.php.net/rfc/property-hooks

Voting will close on Monday 29 April, afternoonish Chicago time.



I'm somewhat conflicted on this one. On the one hand, I think the 
feature will be very powerful, and it's clear a lot of effort has been 
put into designing something that fits with the language. On the other 
hand, however, I share the concerns some have expressed that it is a 
very complex proposal.


I would have more enthusiastically supported one which left out a few 
"bells and whistles", or moved them to Future Scope to be polished and 
agreed separately. I hope I was consistent in expressing that during the 
discussion phase.


For that reason, please consider this an "abstention": while I'm 
reasonably happy for the RFC as written to pass, I am not going to cast 
a Yes vote myself.


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] [Discussion] #[\Deprecated] attribute again v1.3

2024-04-26 Thread Rowan Tommins [IMSoP]

On 26 April 2024 09:40:57 BST, Mike Schinkel  wrote:

>Given a lack of agreed definition for 'since' it appears you are using narrow 
>assumptions about the meaning of 'since' that led you to view 'since' as 
>useless.

I can't see any ambiguity in the definition: "This function has been deprecated 
since version 7.2" seems a straightforward English sentence, meaning that 
before 7.2 it wasn't deprecated, and from that version onward it is.

If there's some alternative reading of it, it's not that I'm assuming it 
doesn't apply, it's that I'm completely unaware of what it might be.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] [Discussion] #[\Deprecated] attribute again v1.3

2024-04-25 Thread Rowan Tommins [IMSoP]

On 25 April 2024 22:01:35 BST, Mike Schinkel  wrote:
>> On Apr 25, 2024, at 11:28 AM, Rowan Tommins [IMSoP] wrote:
>> If the project has no clear deprecation policy, the information is useless 
>> anyway.
>
>Not true.
>
>Having standardized notation for deprecation would allow tooling to analyze a 
>codebase and determine if it contains deprecated code that needs to be 
>remediated without having to run the code with full coverage.  

I think you missed the context of that sentence - or I'm missing something in 
yours. I meant specifically that the "deprecated since" information is useless 
if there's no published policy on how long something will stay deprecated. 

I think the "deprecated" attribute itself is definitely useful.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] [Discussion] #[\Deprecated] attribute again v1.3

2024-04-25 Thread Rowan Tommins [IMSoP]


On 25/04/2024 08:40, Stephen Reay wrote:
If you're on X.y and it says it was deprecated in X.w you know you 
don't need to worry about it being removed until at least Y.a.



Yeah, that's the reasoning given in the Rust discussion, but I don't 
find it convincing.


If the project's deprecation policy is that deprecations will be removed 
in the next major version, the information is redundant: if you get the 
deprecation message in 2.x, you know it will be removed in 3.0


If the project has some other deprecation policy, like "after 1 full 
major version cycle", then you can work out that "since: 2.3" means 
removal in 4.0; but the person adding the attribute also knows that, and 
could save the reader some effort by writing "planned removal: 4.0"


If the project has no clear deprecation policy, the information is 
useless anyway.



If you wanted it to be clearer I'd suggest maybe rename "since" to 
"version", but that's more to give a hint at intended use than anything.



I don't think there's anything *unclear* about "since", I just don't 
think it's very *useful*. But apparently it's common to write it, so I 
guess I'm in the minority.


Naming it "version" would just make it less clear, and not resolve 
anything from my point of view.



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] [Discussion] #[\Deprecated] attribute again v1.3

2024-04-25 Thread Rowan Tommins [IMSoP]

On 24 April 2024 18:18:28 BST, Jorg Sowa  wrote:
> What about setting this parameter vaguely as the boolean we can pass?
> ...
> #[Deprecated(since: $packageVersion > 5.5)]
> #[Deprecated(since: PHP_VERSION_ID > 80100)]
> #[Deprecated(since: date("Y-m-d") > "2024-01-21")]

Even if these expressions were legal, as far as I know, standard reflection 
doesn't give any access to the source code or AST of how the attribute was 
written, so this would just end up with a meaningless "$since = true", and some 
source code that might as well be a comment. 
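
To illustrate with the one expression of the three that is already a valid constant expression: DeprecatedSketch below is a hypothetical userland attribute, not the one proposed in the RFC, and old_api() is a placeholder.

```php
<?php
#[Attribute(Attribute::TARGET_FUNCTION)]
final class DeprecatedSketch
{
    public function __construct(public bool $since) {}
}

#[DeprecatedSketch(since: PHP_VERSION_ID > 80100)]
function old_api(): void {}

$attribute = (new ReflectionFunction('old_api'))
    ->getAttributes(DeprecatedSketch::class)[0];

// Reflection exposes only the evaluated argument, not the expression:
// the comparison has collapsed to a plain boolean.
$arguments = $attribute->getArguments();
```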

To be honest, I'm not really sure what I'd do with the information in a "since" 
field even if it was there. If you were running PHP 7.4, what difference would 
it make to know that create_function was deprecated in 7.2, rather than in 7.1 
or 7.3? The two relevant facts are when the suggested replacement was 
introduced (in case you need to support multiple versions); and what is the 
soonest that the deprecated feature will be removed. The second in particular 
is something I would like every deprecation message to include, rather than the 
vague "may be removed in a future version".

I found this discussion of "since" in Rust's implementation, but don't find the 
arguments in favour particularly compelling: 
https://github.com/rust-lang/rfcs/pull/1270#issuecomment-138043714

Of interest, that discussion also linked to a related feature in Java, which 
could perhaps be added to a list in the RFC alongside the Rust and JetBrains 
ones already mentioned: https://openjdk.org/jeps/277

It's interesting to note, for instance, that both Java and Rust designers 
considered a specific "replacement" field, but decided that it was unlikely to 
be useful in practice. The Java proposal states this nicely: 

> In practice, there is never a drop-in replacement API for
> any deprecated API; there are always tradeoffs and
> design considerations, or choices to be made among
> several possible replacements. All of these topics require
> discussion and are thus better suited for textual
> documentation.

The JetBrains attribute *does* include a "replacement" argument, but it's 
heavily tied into a specific use case: it contains a template used for code 
transformation in the IDE. Both it and "since" are explicitly marked 
"applicable only for PhpStorm stubs".

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC][Vote announcement] Property hooks

2024-04-12 Thread Rowan Tommins [IMSoP]




On 10 April 2024 04:40:13 BST, Juliette Reinders Folmer wrote:

* Whether a type can be specified on the parameter on `set` depends on whether 
the property is typed. You cannot declare `set(mixed $value)` for an untyped 
property, even though it would effectively be compatible. This is inconsistent 
with the behaviour for, for instance method overloads, where this is 
acceptable: https://3v4l.org/hbCor/rfc#vrfc.property-hooks , though it is 
consistent with the behaviour of property overloads, where this is not 
acceptable: https://3v4l.org/seDWM (anyone up for an RFC to fix this 
inconsistency ?)



Just picking up on this point, because it's a bit of a tangle: PHP currently 
makes a hard distinction between "typed properties" and "untyped properties". 
For instance, unset() works differently, and the "readonly" modifier can only 
be added to a typed property.
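
One visible consequence of that distinction, as a minimal sketch with a made-up class: an untyped declared property starts out as null, while a typed one starts out uninitialized even when its type is nullable.

```php
<?php
class Example
{
    public $untyped;    // declared but untyped: implicitly starts as null
    public ?int $typed; // typed: starts uninitialized, even though nullable
    // "public readonly $x;" would be a compile error: readonly needs a type
}

$example = new Example();

$untypedReady = (new ReflectionProperty(Example::class, 'untyped'))
    ->isInitialized($example);
$typedReady = (new ReflectionProperty(Example::class, 'typed'))
    ->isInitialized($example);
$untypedValue = $example->untyped;
```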

That's actually rather relevant to your point, because if this RFC passes we 
would probably need to consider that PHP has at least 4 types of properties: 

- dynamic properties (deprecated by default, but allowed with an attribute)
- declared but untyped properties
- typed properties
- virtual properties

But maybe 6, with: 

- untyped properties with hooks
- typed properties with hooks

Of course, most of the time, users aren't aware of the current 3-way split, and 
they won't need to think about all 6 of these variations. But there are going 
to be cases where documentation or a future RFC has to cover edge cases of each.

I do think there is scope for removing some features from the RFC which are 
nice but not essential, and reducing these combinations. For instance, if we 
limit the access to the underlying property, we might be able to treat "virtual 
properties" as just an optimisation: the engine doesn't allocate a property it 
knows will never be accessed, and accesses to it, e.g. via reflection, just 
return "uninitialized".

I am however conscious that RFCs have failed in the past for being "not 
complete enough" as well as for being "too complex".

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-10 Thread Rowan Tommins [IMSoP]

On 10 April 2024 10:38:44 BST, Saki Takamachi  wrote:

>I was thinking about this today, and I think both are correct opinions on 
>whether to set the initial value to HALF_UP or TOWARD_ZERO. It's just a matter 
>of prioritizing whether consistency with existing behavior or consistency 
>within a class, and they can never be met simultaneously.

Yes, I agree there's a dilemma there.

The extra point in favour of TOWARD_ZERO is that it's more efficient, because 
we don't have to over-calculate and round, just pass scale directly to the 
implementation. Any other option makes for unnecessary extra calculation in 
code like this:

$total = new Number('10');
$raw_frac = $total / 7;
$rounded_frac = $raw_frac->round(2, Round::HALF_UP);

If HALF_UP rounding is the implied default, we have to calculate with scale 11 
giving 1.42857142857, round to 1.4285714286, then round again to 1.43.

If truncation / TOWARD_ZERO is the implied default, we only calculate with 
scale 10 giving 1.4285714285 and then round once to 1.43.

(Of course, in this example, the most efficient would be for the user to write 
$rounded_frac = $total->div(7, 2, Round::HALF_UP) but they might have reasons 
to keep the division and rounding separate.)

Regards,
-- 
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-10 Thread Rowan Tommins [IMSoP]

On 10 April 2024 00:36:21 BST, Saki Takamachi  wrote:
>- The scale and rounding mode are not required for example in add, since the 
>scale of the result will never be infinite and we can automatically calculate 
>the scale needed to fit the result. Does adding those two options to all 
>calculations mean adding them to calculations like add as well?

That's why I mentioned the two different groups of users. The scale and 
rounding mode aren't there for group (a), who just want the scale to be managed 
automatically; they are there for group (b), who want to guarantee a particular 
result has a particular scale. The result of $a->add($b, 2, Round::HALF_UP) 
will always be the same as $a->add($b)->round(2, Round::HALF_UP) but is more 
convenient, and in some cases more efficient, since it doesn't calculate 
unnecessary digits. 

Remember also the title and original aim of the RFC: add object support to 
BCMath. The scale parameter is already there on the existing functions (bcadd, 
bcmul, etc), so removing it on the object version would be surprising. The 
rounding mode is a new feature, but there doesn't seem a good reason not to 
include it everywhere as well.

>- As Tim mentioned, it may be confusing to have an initial value separate from 
>the mode of the `round()` method. Would it make sense to have an initial value 
>of HALF_UP?

Again, the aim was to match the functionality of the existing functions. It's 
likely that users will migrate code written using bcdiv() to use 
BCMath\Number->div() and expect it to work the same, at least when specifying a 
scale. Having it behave differently by rounding up the last digit by default 
seems like a bad idea. 

Thinking about the implementation, the truncation behaviour also makes sense: 
the library isn't actually rounding anything, it's calculating digit by digit, 
and stopping when it reaches the requested scale.

The whole concept of rounding is something that we are adding, presumably by 
passing $scale+1 to the underlying library functions. It's a nice feature to 
add, but not one that should be on by default, given we're not writing the 
extension from scratch.
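
To make the two behaviours concrete, here is a pure-PHP sketch that does not use BCMath at all. Both function names are hypothetical, the inputs are assumed to be non-negative integers, and the rounding helper only works for scales small enough that the digits fit in a native integer.

```php
<?php
// Long division one digit at a time, stopping (truncating) at $scale.
function divideTruncated(int $dividend, int $divisor, int $scale): string
{
    $result = (string) intdiv($dividend, $divisor);
    $remainder = $dividend % $divisor;
    if ($scale > 0) {
        $result .= '.';
    }
    for ($i = 0; $i < $scale; $i++) {
        $remainder *= 10;
        $result .= intdiv($remainder, $divisor);
        $remainder %= $divisor;
    }
    return $result;
}

// HALF_UP layered on top: compute one extra digit, then round it away.
function divideHalfUp(int $dividend, int $divisor, int $scale): string
{
    $digits = str_replace('.', '', divideTruncated($dividend, $divisor, $scale + 1));
    $rounded = (string) intdiv((int) $digits + 5, 10);
    $rounded = str_pad($rounded, $scale + 1, '0', STR_PAD_LEFT);

    return $scale > 0
        ? substr($rounded, 0, -$scale) . '.' . substr($rounded, -$scale)
        : $rounded;
}
```

For example, divideTruncated(20, 3, 5) gives "6.66666", while divideHalfUp(20, 3, 5) gives "6.66667" at the cost of one extra digit of division.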

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-09 Thread Rowan Tommins [IMSoP]


On 24/03/2024 13:13, Saki Takamachi wrote:

https://wiki.php.net/rfc/support_object_type_in_bcmath



Based on the various discussions we've been having, I'd like to propose 
a simplified handling of "scale".


I think there are two groups of users we are trying to help:

a) Users who want an "infinite" scale, and will round manually when 
absolutely necessary, e.g. for display. The scale can't actually be 
infinite in the case of calculations like 1/3, so they need some safe 
cut-off.


b) Users who want to perform operations on a fixed scale, with 
configurable rounding, e.g. for e-commerce pricing. They are not 
interested in any larger scale, except possibly in some intermediate 
calculations, when they want the same as group (a).


I propose:

- The constructor accepts string|int $num only.

- All operations accept an optional scale and rounding mode.

- If no rounding mode is provided, the default behaviour is to truncate. 
This means that (new BCMath\Number('20'))->div(3, 5) has the same result 
as bcdiv('20', '3', 5) which is 6.66666


- If a rounding mode is provided, the object transparently calculates 
one extra digit of scale, then rounds according to the specified mode.


- If no scale is provided, most operations will automatically calculate 
the required scale, e.g. add will use the larger of the two scales. This 
is the same as the current RFC.


- If no scale is provided to div(), sqrt(), or pow(-$x), the result will 
be calculated to the scale of the left-hand operand, plus 10. This is 
the default behaviour in the current RFC.


- Operator overloads behave the same as not specifying a scale or 
rounding mode to the corresponding method. Therefore (new 
BCMath\Number('20')) / (new BCMath\Number('3')) will result in 
6.6666666666 - an automatic scale of 10, and truncation of further digits.


Compared to the current RFC, that means:

- Remove the ability to customise "max expansion scale". For most users, 
this is a technical detail which is more confusing than useful. Users in 
group (b) will never encounter it, because they will specify scale 
manually; advanced users in group (a) may want to customise the logic in 
different ways anyway.


- Remove the ability for a Number value to carry around its own default 
rounding mode. Users in group (a) will never use it. Users in group (b) 
are likely to want the same rounding in the whole application, but 
providing it on every call to new Number() is no easier than providing 
it on each fixed-scale calculation.


- Remove the $maxExpansionScale and $roundingMode properties and 
constructor parameters.


- Remove withMaxExpansionScale and withRoundMode.

- Remove all the logic around propagating rounding mode and expansion 
scale between objects.


I've also noticed that the round method is currently defined as:

- public function round(int $precision = 0, int $mode = 
PHP_ROUND_HALF_UP): Number {}


Presumably $precision here is actually the desired scale of the result? 
If so, it should probably be named $scale, as in the rest of the interface.


I realise it's called $precision in the global round() function; that's 
presumably a mistake which is now hard to fix due to named parameters.


Ideally, it would be nice to have both roundToPrecision() and 
roundToScale(), but as Jordan explained, an implementation which 
actually calculated precision could be difficult and slow.


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?

2024-04-09 Thread Rowan Tommins [IMSoP]

On 8 April 2024 21:51:46 BST, Jordan LeDoux  wrote:
>I have mentioned before that my understanding of the deeper aspects of how
>zvals work is very lacking compared to some others, so this is very
>helpful.

My own knowledge definitely has gaps and errors, and comes mostly from 
introductions like https://www.phpinternalsbook.com/ and in this case Nikita's 
blog articles about the changes in 7.0: 
https://www.npopov.com/2015/05/05/Internal-value-representation-in-PHP-7-part-1.html

> I confess that I do not
>understand the technical intricacies of the interned strings and packed
>arrays, I just understand that the zval structure for these arbitrary
>precision values would probably be non-trivial, and from what I was able to
>research and determine that was in part related to the 64bit zval limit.

From previous discussions, I gather that the hardest part of implementing a new 
zval type is probably not the memory structure itself - that will mostly be 
handled in a few key functions and macros - but the sheer number of places that 
do something different with each zval type and will need updating. Searching 
for Z_TYPE_P, which is just one of the macros used for that purpose, shows over 
200 lines to check: 
https://heap.space/search?project=php-src=Z_TYPE_P=c

That's why it's so much easier to wrap a new type in an object, because then 
all of those code paths are considered for you, you just have a fixed set of 
handlers to implement. If Ilija's "data classes" proposal progresses, you'll be 
able to have copy-on-write for free as well.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?

2024-04-08 Thread Rowan Tommins [IMSoP]

, even making it into the 
"bundled" list doesn't mean it's installed by default everywhere, and 
userland libraries spend a lot of effort polyfilling things which would 
ideally be available by default.



This is, essentially, the thesis of the research and work that I have 
done in the space since joining the internals mailing list.



Thanks, there's some really useful perspective there.

Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?

2024-04-08 Thread Rowan Tommins [IMSoP]

On Mon, 8 Apr 2024, at 13:42, Arvids Godjuks wrote:
> The ini setting I was considering would function similarly to what it does 
> for floats right now - I assume it changes the exponent, thereby increasing 
> their precision but reducing the integer range they can cover.

If you're thinking of the "precision" setting, it doesn't do anything nearly 
that clever; it's purely about how many decimal digits should be *displayed* 
when converting a binary float value to a decimal string. In recent versions of 
PHP, it has a "-1" setting that automatically does the right thing in most 
cases. https://www.php.net/manual/en/ini.core.php#ini.precision
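
A small sketch of that display-only effect: the float itself never changes, only the string produced from it. The variable names are just for illustration.

```php
<?php
$sum = 0.1 + 0.2;          // stored as a binary float; not exactly 0.3

ini_set('precision', '14');
$short = (string) $sum;    // "0.3"

ini_set('precision', '17');
$long = (string) $sum;     // "0.30000000000000004"

ini_set('precision', '-1');
$auto = (string) $sum;     // shortest string that converts back to the same float

var_dump($short, $long, $auto);
var_dump($sum === 0.3);    // bool(false), whatever the setting
```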

The other way around - parsing a string to a float, including when compiling 
source code - has a lot of different compile-time options, presumably to 
optimise on different platforms; but no user options at all: 
https://github.com/php/php-src/blob/master/Zend/zend_strtod.c

Regards,
-- 
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type

2024-04-08 Thread Rowan Tommins [IMSoP]

On 8 April 2024 10:12:31 BST, Saki Takamachi  wrote:
>
>I don't see any point in "scalar types" that feel almost like objects, because 
>it just feels like you're manipulating objects with procedural functions. Why 
>not just use objects instead?

Again, I don't think "has more than one attribute" is the same as "feel almost 
like objects". But we're just getting further away from the current discussion, 
I think.

>Sorry, but I have no idea what you mean by "numbers have rounding modes". 
>Numbers are just numbers, and if there's something other than numbers in 
>there, then to me it's an object.

The proposed class is called BCMath\Number, which implies that every instance 
of that class represents a number, just as every instance of a class called 
DateTime represents a date and time.

In the end, a class is just a type definition. In pure OOP, it defines the type 
by its behaviour (methods / messages); in practice, it also defines the 
properties that each value of the type needs.

So I am saying that if you were designing a class to represent numbers, you 
would start by saying "what properties does every number value have?" I don't 
think "rounding mode" would be on that list, so I don't think it belongs on a 
class called Number.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type

2024-04-08 Thread Rowan Tommins [IMSoP]

On 8 April 2024 01:34:45 BST, Saki Takamachi  wrote:
>
> I'm making these opinions from an API design perspective. How the data is 
> held internally is irrelevant. zval has a lot of data other than values. What 
> I want to say is that if multiple types of parameters are required for 
> initialization, they may only serve as a substitute for object for the user.

Again, that only seems related to objects because that's what you're used to in 
PHP, and even then you're overlooking an obvious exception: array(1, 2)

If we ever do want to make decimals a native type, we would need some way to 
initialise a decimal value, since 1.2 will initialise a float. One of the most 
obvious options is a function-like syntax, decimal(1.2). If we do want numbers 
to carry extra information in each value, it will be no problem at all to 
support that.

On the other side, just because something's easy doesn't mean it's the right 
solution. We could make an object which contained a number and an operation, 
and write this: 

$a = new NumberOp(42, 'add');
$b = $a->exec(15);
$c = $b->withOperation('mul');
$d = $c->exec(2);

I'm sure you'd agree that would be a bad design.

So, again, I urge you to forget about it being easy to stick an extra property 
on an object, and think in the abstract: does it make sense to say "this number 
has a preferred rounding mode", rather than "this operation has a preferred 
rounding mode".
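
For comparison, PHP's existing round() function already follows the "operation has a rounding mode" model: the mode is an argument to the call, and the float carries nothing. A minimal illustration:

```php
<?php
// The same value, rounded three ways; the mode belongs to each call.
$value = 2.5;

$halfUp   = round($value);                          // default is PHP_ROUND_HALF_UP
$halfEven = round($value, 0, PHP_ROUND_HALF_EVEN);
$halfDown = round($value, 0, PHP_ROUND_HALF_DOWN);

var_dump($halfUp);    // float(3)
var_dump($halfEven);  // float(2)
var_dump($halfDown);  // float(2)
```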

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?

2024-04-07 Thread Rowan Tommins [IMSoP]


On 07/04/2024 20:55, Jordan LeDoux wrote:

I have been doing small bits of work, research, and investigation into 
an MPDec or MPFR implementation for years, and I'm likely to continue 
doing my research on that regardless of whatever is discussed in this 
thread.



I absolutely encourage you to do that. What I'm hoping is that you can 
share some of what you already know now, so that while we're discussing 
BCMath\Number, we can think ahead a bit to what other similar APIs we 
might build in the future. The below seems to be exactly that.




Yes. BCMath uses fixed-scale, all the other libraries use 
fixed-precision. That is, the other libraries use a fixed number of 
significant digits, while BCMath uses a fixed number of digits after 
the decimal point.



That seems like a significant difference indeed, and one that is 
potentially far more important than whether we build an OO wrapper or a 
"scalar" one.

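As a rough illustration of the two terms, the sketch below counts scale (digits after the decimal point) and precision (significant digits) for a few decimal strings. scaleOf() and precisionOf() are hypothetical helpers that assume a plain unsigned decimal string with no exponent; they are not part of BCMath, MPDec or MPFR.

```php
<?php
// Hypothetical helpers for illustration only.
function scaleOf(string $num): int
{
    $dot = strpos($num, '.');
    return $dot === false ? 0 : strlen($num) - $dot - 1;
}

function precisionOf(string $num): int
{
    // Significant digits: drop the decimal point, then any leading zeros.
    $digits = ltrim(str_replace('.', '', $num), '0');
    return strlen($digits);
}

foreach (['0.00000013', '1234.5', '12.50'] as $n) {
    printf("%s: scale %d, precision %d\n", $n, scaleOf($n), precisionOf($n));
}
```

A small value with many leading zeros has a large scale but a small precision, while a large value with one decimal place has the opposite.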


So, for instance, it would not actually be possible without manual 
rounding in the PHP implementation to force exactly 2 decimal digits 
of accuracy in the result and no more with MPDec.



The current BCMath proposal is to mostly choose the scale calculations 
automatically, and to give precise control of rounding. Neither of those 
are implemented in libbcmath, which requires an explicit scale, and 
simply truncates the result at that point.


That's why I said that the proposal isn't really about "an OO wrapper 
for BCMath" any more, it's a fairly generic Number API, with libbcmath 
as the back-end which we currently have available. So thinking about 
what other back-ends we might build with the same or similar wrappers is 
useful and relevant.



The idea of money, for instance, wanting exactly two digits would 
require the implementation to round, because something like 0.00000013 
has two digits of *precision*, which is what MPDec uses, but it has 8 
digits of scale which is what BCMath uses.



This brings us back to what the use cases are we're trying to cover with 
these wrappers.


The example of fixed-scale money is not just a small niche that I happen 
to know about: brick/money has 16k stars on GitHub, and 18 million 
installs on Packagist; moneyphp/money has 4.5k stars and 45 million 
installs; one has implementations based on plain PHP, GMP, and BCMath; 
the other has a hard dependency on BCMath.


Presumably, there are other use cases where working with precision 
rather than scale is essential, maybe just as popular (or that could be 
just as popular, if they could be implemented better).


In which case, should we be designing a NumberInterface that provides 
both, with BCMath having a custom (and maybe slow) implementation for 
round-to-precision, and MPDec/MPFR having a custom (and maybe slow) 
implementation for round-to-scale?


Or, should we abandon the idea of having one preferred number-handling 
API (whether that's NumberInterface or a core decimal type), because no 
implementation could handle both use cases?



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type

2024-04-07 Thread Rowan Tommins [IMSoP]


On 07/04/2024 14:27, Saki Takamachi wrote:
If we really wanted decimal to be a native type, then the rounding 
mode and scale behavior should be completely fixed and not user 
selectable. If not, decimal should be implemented as a class.



As I replied to Jordan, I don't see why this is connected to "scalar" vs 
"object" types at all. An object - particularly an immutable one - is 
just a way of declaring a type, and some syntax for operations on that 
type. There's really no difference at all between these:


$half = $whole / 2;
$half = numeric_div($whole, 2);
$half = $whole->div(2);

In PHP, right now, the last one is only available on objects, but there 
have been proposals in the past to change that; it's just syntax.


For rounding, the first one is the real problem, because there's nowhere 
to put an extra operand. That problem is the same for a class with an 
overloaded "/" operator, and a "scalar" type which has a definition of 
"/" in the engine.


Maybe it feels more obvious that an object can carry extra state in 
private properties, but internal types don't actually need private 
properties at all. PHP's "array" type has a bunch of different state 
without being an object (a linked list of items, a hashtable for random 
access, an iteration pointer, etc); and SimpleXMLElement and DOMNode are 
exposed in PHP as separate classes, but actually store state in the same 
C struct provided by libxml2.


So I see it just as an API design decision: do we specify the rounding 
mode of numeric division a) on every operation; b) on every value; c) in 
a runtime setting (ini_set); d) in a lexically scoped setting (declare)?


My vote is for (a), maybe with (d) as a fallback.

Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-07 Thread Rowan Tommins [IMSoP]


On 07/04/2024 18:09, Tim Düsterhus wrote:
- I'm not sure if the priority for the rounding modes is sound. My gut 
feeling is that operations on numbers with different rounding modes 
should be disallowed or made explicit during the operation (much like 
the scale for a division), but I'm not an expert in designing numeric 
APIs, so I might be wrong here. 



Personally, I'm not a fan of setting the rounding mode and the "max 
expansion scale" on each instance, for the same reason I'm not keen on 
having the collation on each instance in Derick's Unicode string draft.


I understand the temptation: specifying it for every operation makes 
code more verbose, particularly since it rules out use of $a / $b; while 
specifying it as a global or scoped option would make code harder to 
reason about.


But I think carrying it around on the instance doesn't really solve 
either problem, and creates several new ones:


- A program which wants all operations to use the same rounding system 
still has to specify the options every time it initialises a value, 
which is probably nearly as often as operating on them.


- A program which wants different modes at different times will end up 
calling $foo->withRoundingMode(RoundingMode::HALF_UP)->div(2), which is 
more verbose and probably slower than $foo->div(2, RoundingMode::HALF_UP)


- You can't look at a function accepting a Number as input and know what 
rounding mode it will operate in, unless it explicitly changes it. It 
would be easier to scan up to find a per-file / per-block declare() 
directive, than to trace the calling code to know the rounding mode of 
an instance.


- A complex set of rules is invented to "prioritise" the options in 
operations like $a + $b. Or, that operation has to be forbidden unless 
the mode is consistent, at which point it might as well be a global setting.



As a thought experiment for comparison, imagine if to sort an array 
numerically you had to write this:


$array = array_set_sorting_mode($array, SORT_NUMERIC);
$array = array_sort($array);

Or worse, if you had to set it on each string:

$array = array_map(fn($s) => $s->withSortingMode(SORT_NUMERIC), $array);
$array = array_sort($array);

Rather than (assuming we replaced the current by-reference sorts):

$array = array_sort($array, SORT_NUMERIC);

Because we're designing an object, attaching extra properties to it is 
easy, but I don't think it actually makes it easy to use.
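
For reference, this is how the real (by-reference) sort() works today: the mode is passed to the operation, and neither the array nor its elements remember it. The variable names are only illustrative.

```php
<?php
$input = ['10', '9', '2', '1'];

$numeric = $input;
sort($numeric, SORT_NUMERIC);   // compares the values as numbers

$textual = $input;
sort($textual, SORT_STRING);    // compares the values as strings

echo implode(', ', $numeric), "\n";  // 1, 2, 9, 10
echo implode(', ', $textual), "\n";  // 1, 10, 2, 9
```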



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?

2024-04-07 Thread Rowan Tommins [IMSoP]

On 7 April 2024 15:38:04 BST, Saki Takamachi  wrote:
>> In other words, looking at how the efforts overlap doesn't have to mean 
>> abandoning one of them, it can mean finding how one can benefit the other.
>
>I agree that the essence of the debate is as you say.
>However, an argument must always reach a conclusion based on its purpose, and 
>combining two arguments with different purposes can make it unclear how to 
>reach a conclusion.

Well, that's the original question: are they actually different purposes, from 
the point of view of a user?

I just gave a concrete suggestion, which didn't involve "combining two 
arguments", it involved splitting them up into three projects which all 
complement each other. 

It feels like both you and Jordan feel the need to defend the work you've put 
in so far, which is a shame; as a neutral party, I want to benefit from *both* 
of your efforts. It really doesn't matter to me how many mailing list threads 
that requires, as long as there aren't two teams making conflicting designs for 
the same feature.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?

2024-04-07 Thread Rowan Tommins [IMSoP]

On 7 April 2024 11:44:22 BST, Saki Takamachi  wrote:
>I don't think the two threads can be combined because they have different 
>goals. If one side of the argument was, "How about to add BCMath?" then 
>perhaps we should merge the discussion. But BCMath already exists and the 
>agenda is to add an OOP API.
>
>In other words, one is about adding new features, and the other is about 
>improving existing features.

While I appreciate that that was the original aim, a lot of the discussion at 
the moment isn't really about BCMath at all, it's about how to define a 
fixed-precision number type. For instance, how to specify precision and 
rounding for operations like division. I haven't seen anywhere in the 
discussion where the answer was "that's how it already works, and we're not 
adding new features".

Is there anything in the proposal which would actually be different if it was 
based on a different library, and if not, should we be designing a 
NumberInterface which multiple extensions could implement? Then Jordan's search 
for a library with better performance could lead to new extensions implementing 
that interface, even if they have portability or licensing problems that make 
them awkward to bundle in core.

Finally, there's the separate discussion about making a new "scalar type". As I 
said in a previous email, I'm not really sure what "scalar" means in this 
context, so maybe "integrating the type more directly into the language" is a 
better description? That includes memory/copying optimisation (potentially 
linked to Ilija's work on data classes), initialisation syntax (which could be 
a general feature), and accepting the type in existing functions (something 
frequently requested for custom array-like types).

In other words, looking at how the efforts overlap doesn't have to mean 
abandoning one of them, it can mean finding how one can benefit the other.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Native decimal scalar support and object types in BcMath - do we want both?

2024-04-07 Thread Rowan Tommins [IMSoP]

On 7 April 2024 01:32:29 BST, Jordan LeDoux  wrote:

>Internals is just volunteers. The people working on BCMath are doing that
>because they want to, the people working on scalar decimal stuff are doing
>that because they want to, and there's no project planning to tell one
>group to stop. That's not how internals works (to the extent it works).

I kind of disagree. You're absolutely right the detailed effort is almost 
always put in by people working on things that interest them, and I want to 
make clear up front that I'm extremely grateful for the amount of effort people 
do volunteer, given how few are paid to work on any of this.

However, the goal of the Internals community as a whole is to choose what 
changes to make to a language which is used by millions of people. That 
absolutely involves project planning, because there isn't a marketplace of PHP 
forks with different competing features, and once a feature is added it's very 
hard to remove it or change its design.

If - and I stress I'm not saying this is true - IF these two features have such 
an overlap that we would only want to release one, then we shouldn't just 
accept whichever is ready first, we should choose which is the better solution 
overall. And if that was the case, why would we wait for a polished 
implementation of both, then tell one group of volunteers that all their hard 
work had been a waste of time?

So I think the question is very valid: do these two features have distinct use 
cases, such that even if we had one, we would still want to spend time on the 
other? Or, should we decide a strategy for both groups to work together towards 
a single goal?

That's not about "telling one group to stop", it's about working together for 
the benefit of both users and the people volunteering their effort, to whom I 
am extremely grateful.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-06 Thread Rowan Tommins [IMSoP]


On 06/04/2024 07:24, Saki Takamachi wrote:

Take a look at the methods shown below:
```
protected static function resultPropertyRules(string $propertyName,
mixed $value1, mixed $value2): mixed {}
```

This method determines which operand value to use in the result after
calculation for a property that the user has defined by extending the
class.



While this is an intriguing idea, it only solves a narrow set of use 
cases. For instance:


- The class might want different behaviour for different operations; 
e.g. Money(42, 'USD') + Money(42, 'USD') should give Money(84, 'USD'); 
but Money(42, 'USD') * Money(42, 'USD') should be an error.


- Properties might need to interact with each other; e.g. Distance(2, 
'metres') + Distance(2, 'feet') could result in Distance(2.6096, 
'metres'); but if you calculate one property at a time, you'll end up 
with Distance(4, 'metres'), which is clearly wrong.


The fundamental problem is that it ignores the OOP concept of 
encapsulation: how an object stores its internal state should not define 
its behaviour. Instead, the object should be able to directly define 
behaviour for the operations it supports.
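
A minimal sketch of what "the object defines the behaviour" could look like for the Distance example. Everything here (the class, the of() factory, storing lengths as tenths of a millimetre) is an assumption made for illustration, not something from the RFC.

```php
<?php
// Hypothetical value object; lengths are held as whole tenths of a millimetre
// so that the metre/foot conversion (1 ft = 0.3048 m) stays exact.
final class Distance
{
    private const UNITS = ['metres' => 10000, 'feet' => 3048];

    private function __construct(private int $tenthsOfMm, private string $unit) {}

    public static function of(int $amount, string $unit): self
    {
        return new self($amount * self::UNITS[$unit], $unit);
    }

    // The class decides what "add" means: both operands are already in a
    // common internal unit, and the result keeps the left-hand display unit.
    public function add(Distance $other): self
    {
        return new self($this->tenthsOfMm + $other->tenthsOfMm, $this->unit);
    }

    public function amount(): float
    {
        return $this->tenthsOfMm / self::UNITS[$this->unit];
    }
}

$sum = Distance::of(2, 'metres')->add(Distance::of(2, 'feet'));
echo $sum->amount(), " metres\n";   // 2.6096 metres
```

Because the conversion happens inside add(), there is no point at which the amount and the unit are combined separately, so Distance(4, 'metres') can never come out.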


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-06 Thread Rowan Tommins [IMSoP]

On 5 April 2024 19:30:24 BST, Jordan LeDoux  wrote:

>A composed class
>does not somehow prevent the accidental error of mixing currencies, it just
>moves where that error would occur

It does not prevent someone accidentally *attempting* to mix currencies, but it 
absolutely prevents that mistake leading to bogus values in the application, 
because the methods available can all detect it and throw an Error. 

If the class has a mixture of currency-aware methods, and methods / operator 
overloads inherited from Number, you can end up getting nonsensical results 
instead of an Error.

>If you want an actual answer about how a Money class would actually work in
>practice, it would likely be something like this:
>
>```
>// $val1 and $val2 are instances of the Money class with unknown currencies
>$val1Currency = $val1->getCurrency();
>$val2Currency = $val2->getCurrency();
>$val1 = $val1->convertToCommonCurrency();
>$val2 = $val2->convertToCommonCurrency();
>// perform the necessary calculations using operators
>$answer = $answer->convertToCurrency($userCurrency);
>```

You have missed out the key section: how do you actually add the two numbers? 
The addition MUST enforce the precondition that its operands are in the same 
currency; any other behaviour is nonsensical. So the definition of that method 
must be on the Money class, not inherited from Number. 

If the add() method on Number is final, you'll need to define a new method 
$val1->addCurrency($val2). The existing add() method and operator overload will 
be inherited unchanged, but calling them won't just be useless, it will be 
dangerous, because they can give nonsensical results. 

That's what makes composition the better design in this case, because the Money 
class can expose only the methods that actually have useful and safe behaviour.

The fact that the composed class can't add its own operator overloads is 
unfortunate; but allowing inheritance wouldn't solve that, because the 
inherited overloads are all wrong anyway.
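
A minimal sketch of the composed design, assuming an integer amount in minor units stands in for the wrapped Number object. The class and method names are hypothetical, and InvalidArgumentException is just one reasonable choice of error.

```php
<?php
// Hypothetical composed Money class: it holds an amount rather than extending
// a number class, so only currency-aware operations are exposed.
final class Money
{
    public function __construct(
        private int $minorUnits,   // e.g. cents; stands in for a Number object
        private string $currency,
    ) {}

    public function add(Money $other): self
    {
        if ($other->currency !== $this->currency) {
            throw new InvalidArgumentException(
                "Cannot add {$other->currency} to {$this->currency}"
            );
        }
        return new self($this->minorUnits + $other->minorUnits, $this->currency);
    }

    public function minorUnits(): int { return $this->minorUnits; }
    public function currency(): string { return $this->currency; }
}

$total = (new Money(4200, 'USD'))->add(new Money(4200, 'USD'));
echo $total->minorUnits(), ' ', $total->currency(), "\n";   // 8400 USD

try {
    (new Money(4200, 'USD'))->add(new Money(4200, 'EUR'));
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";   // Cannot add EUR to USD
}
```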

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type

2024-04-05 Thread Rowan Tommins [IMSoP]


On 04/04/2024 23:31, Jordan LeDoux wrote:


Well, firstly most of the financial applications that I've worked in 
(I work for a firm that writes accounting software right now) do not 
calculate intermediate steps like this with fixed precision, or even 
have an interest in doing so.



My background is in e-commerce (specifically, travel) rather than 
finance. In that context, it's common to have a single-step operation 
like "per_person_price equals total_price divided by number of 
passengers" where both per_person_price and total_price are going to be 
expressed with the same accuracy.


The top two results for "money" on Packagist are 
https://www.moneyphp.org/ and https://github.com/brick/money both of 
which take this general model: the scale of values is fixed, and every 
operation that might produce fractions of that requires a rounding mode.



Truly "fixed-precision" is not something that decimal should even try 
to be, in my opinion. The use cases where you CANNOT simply round the 
result at the end to fit your display output or your storage location 
are very minimal.



In that case, why do we need to think about the scale or precision of a 
decimal at all? What would the syntax 1.234_d3 do differently from 1.234_d?



I mean, what you are describing is how such OBJECTS are designed in 
other languages like Python, but not scalars.



I don't see any connection at all between what I'm describing and 
objects. I'm talking about what operations make sense on a particular 
data type, regardless of how that data type is implemented.


To be honest, I'm not really sure what "scalar" means in this context. 
In PHP, we call strings "scalars" because they're neither "arrays" nor 
"objects"; but none of those have definitions which are universal to 
other languages.



This kind of precision restriction isn't something you would place on 
an individual value, it's something you would place on all 
calculations. That's why in Python this is done with a global runtime 
setting using `getcontext().prec` and `getcontext().rounding`. A 
particular value having a precision of X doesn't imply anything 
concrete about a calculation that uses that value necessarily.



Global settings avoid needing extra parameters to each operation, but 
don't really work for the use case I'm describing: different currencies 
have different "natural" scale, e.g. Japanese Yen have a scale of 0 (no 
fractional Yen), Bitcoin has a scale of 8 (100 million satoshis in 1 
bitcoin).  A program dealing with multiple currencies will want to 
assign a different scale to different values.
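
A small sketch of per-value scale, assuming amounts are stored as integers in the smallest unit of each currency. The CURRENCY_SCALE table and formatMinorUnits() are invented for illustration; the JPY and BTC scales are the ones mentioned above, and USD is added as a familiar comparison.

```php
<?php
// Illustrative table of "natural" scales per currency.
const CURRENCY_SCALE = ['JPY' => 0, 'USD' => 2, 'BTC' => 8];

// Hypothetical formatter: turn a non-negative amount in minor units into a
// decimal string with the currency's natural scale.
function formatMinorUnits(int $minorUnits, string $currency): string
{
    $scale = CURRENCY_SCALE[$currency];
    if ($scale === 0) {
        return (string) $minorUnits;
    }
    $factor = 10 ** $scale;
    return sprintf(
        '%d.%0' . $scale . 'd',
        intdiv($minorUnits, $factor),
        $minorUnits % $factor
    );
}

echo formatMinorUnits(1500, 'JPY'), "\n";        // 1500
echo formatMinorUnits(1999, 'USD'), "\n";        // 19.99
echo formatMinorUnits(150000000, 'BTC'), "\n";   // 1.50000000
```

A single global scale setting could not serve all three values in the same request, which is the point being made above.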



Maybe we're just not understanding each other. Are you opposed to the 
idea of doing this as a scalar?



Not at all; my first examples used method syntax, because I was basing 
them on https://github.com/brick/money In my last e-mail, I also gave 
examples using normal function syntax.


What I'm saying is that $x / 2 doesn't have a good answer if $x is a 
fixed-precision number which can't be divided by 2 without exceeding 
that precision. You need a third operand, the rounding mode, so you 
can't write it as a binary operator, and need some kind of function like 
decimal_div(decimal $dividend, int|decimal $divisor, RoundingMode 
$roundingMode). How you implement "decimal" doesn't change that at all.
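
As a sketch of that three-operand shape, using integers in the smallest unit (so a fixed scale of 2 becomes "cents"). The DivRounding enum and fixedDiv() are hypothetical names, and the function only covers non-negative amounts and positive divisors.

```php
<?php
// Hypothetical rounding modes for the sketch.
enum DivRounding
{
    case Down;
    case Up;
    case HalfUp;
}

// Divide an amount in minor units; the rounding mode is a required operand.
function fixedDiv(int $minorUnits, int $divisor, DivRounding $mode): int
{
    $quotient  = intdiv($minorUnits, $divisor);
    $remainder = $minorUnits % $divisor;

    return match ($mode) {
        DivRounding::Down   => $quotient,
        DivRounding::Up     => $quotient + ($remainder > 0 ? 1 : 0),
        DivRounding::HalfUp => $quotient + (2 * $remainder >= $divisor ? 1 : 0),
    };
}

echo fixedDiv(103, 2, DivRounding::Down), "\n";    // 51  (0.51)
echo fixedDiv(103, 2, DivRounding::Up), "\n";      // 52  (0.52)
echo fixedDiv(1100, 2, DivRounding::Down), "\n";   // 550 (5.50, exact)
```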



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-04 Thread Rowan Tommins [IMSoP]

On 03/04/2024 00:01, Ilija Tovilo wrote:
> Data classes are classes with a single additional
> zend_class_entry.ce_flags flag. So unless customized, they behave as
> classes. This way, we have the option to tweak any behavior we would
> like, but we don't need to.
>
> Of course, this will still require an analysis of what behavior we
> might want to tweak.

Regardless of the implementation, there are a lot of interactions we 
will want to consider; and we will have to keep considering new ones as 
we add to the language. For instance, the Property Hooks RFC would 
probably have needed a section on "Interaction with Data Classes".

On the other hand, maybe having two types of objects to consider each 
time is better than having to consider combinations of lots of small 
features.

On a practical note, a few things I've already thought of to consider:

- Can a data class have readonly properties (or be marked "readonly data 
class")? If so, how will they behave?
- Can you explicitly use the "clone" keyword with an instance of a data 
class? Does it make any difference?

- Tied into that: can you implement __clone(), and when will it be called?
- If you implement __set(), will copy-on-write be triggered before it's 
called?

- Can you implement __destruct()? Will it ever be called?

> Consider this example, which would work with the current approach:
>
> $shapes[0]->position->zero!();

I find this concise example confusing, and I think there's a few things 
to unpack here...

Firstly, there's putting a data object in an array:

$numbers = [ new Number(42) ];
$cow = $numbers;
$cow[0]->increment!();
assert($numbers !== $cow);

This is fairly clearly equivalent to this:

$numbers = [ 42 ];
$cow = $numbers;
$cow[0]++;
assert($numbers !== $cow);

CoW is triggered on the array for both, because ++ and ->increment!() 
are both clearly modifications.
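
For contrast, here is how the two cases behave in PHP today, without data classes: the array of integers separates on write, but an array of ordinary objects only copies the object handles. Counter is a made-up class for the demonstration.

```php
<?php
// Plain values: the array is separated on write.
$numbers = [42];
$cow = $numbers;
$cow[0]++;
var_dump($numbers[0], $cow[0]);                   // int(42), int(43)

// Ordinary objects today: both arrays hold a handle to the same object,
// so the change is visible through either variable.
final class Counter
{
    public function __construct(public int $value) {}

    public function increment(): void
    {
        $this->value++;
    }
}

$counters = [new Counter(42)];
$copy = $counters;
$copy[0]->increment();
var_dump($counters[0]->value, $copy[0]->value);   // int(43), int(43)
```

As I understand the proposal, a data class would make the second case behave like the first.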

Second, there's putting a data object into another data object:

$shape = new Shape(new Position(42,42));
$cow = $shape;
$cow->position->zero!();
assert($shape !== $cow);

This is slightly less obvious, because it presumably depends on the 
definition of Shape. Assuming Position is a data class:

- If Shape is a normal class, changing the value of $cow->position just 
happens in place, and the assertion fails

- If Shape is a readonly class (or position is a readonly property on a 
normal class), changing the value of $cow->position shouldn't be 
allowed, so this will presumably give an error

- If Shape is a data class, changing the value of $shape->position 
implies a "mutation" of $shape itself, so we get a separation before 
anything is modified, and the assertion passes

Unlike in the array case, this behaviour can't be resolved until you 
know the run-time type of $shape.

Now, back to your example:

$shapes = [ new Shape(new Position(42,42)) ];
$cow = $shapes;
$shapes[0]->position->zero!();
assert($cow !== $shapes);

This combines the two, meaning that now we can't know whether to 
separate the array until we know (at run-time) whether Shape is a normal 
class or a data class.

But once that is known, the whole of "->position->zero!()" is a 
modification to $shapes[0], so we need to separate $shapes.

Without such a class-wide marker, you'll need to remember to add the
special syntax exactly where applicable.

$shapes![0]!->position!->zero();

The array access doesn't need any special marker, because there's no 
ambiguity. The ambiguous call is the reference to ->position: in your 
current proposal, this represents a modification *if Shape is a data 
class, and is itself being modified*. My suggestion (or really, thought 
experiment) was that it would represent a modification *if it has a ! in 
the call*.

So if Shape is a readonly class:

$shapes[0]->position->!zero();
// Error: attempting to modify readonly property Shape::$position

$shapes[0]->!position->!zero();
// OK; an optimised version of:
$shapes[0] = clone $shapes[0] with [
    'position' =>  (clone $shapes[0]->position with ['x'=>0,'y'=>0])
];

If ->! is only allowed if the RHS is either a readonly property or a 
mutating method, then this can be reasoned about statically: it will 
either error, or cause a CoW separation of $shapes. It also allows 
classes to mix aspects of "data class" and "normal class" behaviour, 
which might or might not be a good idea.

This is mostly just a thought experiment, but I am a bit concerned that 
code like this is going to be confusingly ambiguous:

$item->shape->position->zero!();

What is going to be CoW cloned, and what is going to be modified in 
place? I can't actually know without knowing the definition behind both 
$item and $item->shape. It might even vary depending on input.

Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type

2024-04-04 Thread Rowan Tommins [IMSoP]


On 04/04/2024 02:29, Jordan LeDoux wrote:
But when it comes to fixed-precision values, it should follow rules very 
similar to those we discussed in the BCMath thread:


- Addition and subtraction should return a value that is the largest 
scale/precision of any operands in the calculation.
- Division and multiplication should return a value that is the sum of 
the scale/precision of any operands + 2 or a default (perhaps 
configurable) value if the sum is small, to ensure that rounding occurs 
correctly. Near zero, floats have about 12-ish decimal digits of 
accuracy, and will return their full accuracy for example.



I haven't followed the discussion in the other thread, but I'm not sure 
what the use case would be for a "fixed scale decimal" that followed 
those rules.


As mentioned before, the use case I've encountered is money 
calculations, where what people want to fix is the smallest unit of 
account - e.g. €0.01 for practical currency, or €0.0001 for detailed 
accounting / trading.


If I write $total = 1.03_d2; $perPerson = $total / 2; I want a result of 
0.51_d2 or 0.52_d2 - that's why I specified a scale of 2 in the first place.


If I want an accurate result of 0.515_d3, I would just specify 1.03_d, 
since the scale hasn't had any effect on the result.


If I want a lossless split into [0.51_d2, 0.52_d2] I still need a 
function to exist somewhere, whether you spell that $total->split(2), or 
decimal_split($total, 2), etc. So it seems safer to also have 
$total->div(2, Round::DOWN) or decimal_div($total, 2, Round::DOWN) and 
have $total / 2 give an error.
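
As a rough sketch of what such a lossless split could look like in current PHP, assuming the amount is held as an integer count of the smallest unit of account (e.g. cents). The function name split_minor_units() is hypothetical, not an existing or proposed API.

```php
<?php
// Hypothetical helper: split a non-negative amount held in minor units
// (e.g. cents) into $parts shares that differ by at most one unit and
// always sum back to the original total.
function split_minor_units(int $total, int $parts): array {
    if ($total < 0 || $parts < 1) {
        throw new InvalidArgumentException('total must be >= 0 and parts >= 1');
    }
    $base = intdiv($total, $parts);      // share everyone gets
    $remainder = $total % $parts;        // units left over after the even split
    $shares = array_fill(0, $parts, $base);
    // Hand the leftover units out one at a time, starting from the last share.
    for ($i = 0; $i < $remainder; $i++) {
        $shares[$parts - 1 - $i]++;
    }
    return $shares;
}

var_dump(split_minor_units(103, 2));   // [51, 52], i.e. 0.51 and 0.52
```

Which share receives the extra unit is an arbitrary choice here; a real API would presumably need to make that explicit.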


Possibly, it could only error if the result doesn't fit in the scale, so 
that this would be fine: $total = 11.00_d2; $perPerson = $total / 2; 
assert($perPerson === 5.50_d2)


Or possibly, it would just be an error to perform division on a fixed 
scale decimal, but allowed on a variable-scale decimal.


Regards,
--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Rowan Tommins [IMSoP]


On 02/04/2024 01:17, Ilija Tovilo wrote:

I'd like to introduce an idea I've played around with for a couple of
weeks: Data classes, sometimes called structs in other languages (e.g.
Swift and C#).



Hi Ilija,

I'm really interested to see how this develops. A couple of thoughts 
that immediately occurred to me...



I'm not sure if you've considered it already, but mutating methods 
should probably be constrained to be void (or maybe "mutating" could 
occupy the return type slot). Otherwise, someone is bound to write this:


$start = new Location('Here');
$end = $start->move!('There');

Expecting it to mean this:

$start = new Location('Here');
$end = $start;
$end->move!('There');

When it would actually mean this:

$start = new Location('Here');
$start->move!('There');
$end = $start;


I seem to remember when this was discussed before, the argument being 
made that separating value objects completely means you have to spend 
time deciding how they interact with every feature of the language.


Does the copy-on-write optimisation actually require the entire class to 
be special, or could it be triggered by a mutating method on any object? 
To allow direct modification of properties as well, we could move the 
call-site marker slightly to a ->! operator:


$foo->!mutate();
$foo->!bar = 42;

The first would be the same as your current version: it would perform a 
CoW reference separation / clone, then call the method, which would 
require a "mutating" marker. The second would essentially be an 
optimised version of $foo = clone $foo with [ 'bar' => 42 ]
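
For readers who want to see the intended semantics in code that runs today, here is a minimal emulation. The Point class and its with() helper are hypothetical; unlike the proposed operator, this version always clones, even when nothing else holds a reference to the object.

```php
<?php
// Hypothetical value-like class; with() is an illustrative helper, not a built-in.
final class Point {
    public function __construct(public int $x = 0, public int $y = 0) {}

    // Return a modified copy, leaving the original object untouched.
    public function with(array $changes): static {
        $copy = clone $this;
        foreach ($changes as $name => $value) {
            $copy->$name = $value;
        }
        return $copy;
    }
}

$foo = new Point(1, 2);
$other = $foo;                    // second handle to the same object
$foo = $foo->with(['x' => 42]);   // roughly what "$foo->!x = 42" is meant to do

var_dump($other->x, $foo->x);     // int(1) int(42)
```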


During the method call or write operation, readonly properties would 
allow an additional write, as is the case in __clone and the "clone 
with" proposal. So a "pure" data object would simply be declared with 
the existing "readonly class" syntax.


The main drawback I can see (outside of the implementation, which I 
can't comment on) is that we couldn't overload the === operator to use 
value semantics. In exchange, a lot of decisions would simply be made 
for us: they would just be objects, with all the same behaviour around 
inheritance, serialization, and so on.



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Rowan Tommins [IMSoP]


On 02/04/2024 20:02, Ilija Tovilo wrote:

But, does it matter? I'm not sure we look at some commits closer than
others, based on its author. It's true that it might be easier to
identify malicious commits if they all come from the same user, but it
wouldn't prevent them.



It's like the difference between stealing someone's credit card, and 
cloning the card of everyone who comes into the shop: in the first case, 
someone needs to check their credit card statements carefully; in the 
second, you'll have a hard job even working out who to contact.


Similarly, if you discover a compromised key or signing account, you can 
look for uses of that key or account, which might be a tiny number from 
a non-core contributor; if you discover a compromised account pushing 
unsigned commits, you have to audit every commit in the repository.


I agree it's not a complete solution, but no security measure is; it's 
always about reducing the attack surface or limiting the damage.


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Rowan Tommins [IMSoP]


On 02/04/2024 18:27, Ilija Tovilo wrote:

If your GitHub account is compromised,
[...] the attacker may simply register their
own gpg key in your account, with the commits appearing as verified.

If your ssh key is compromised instead, and you use ssh to sign your
commits, the attacker may sign their malicious commits with that same
key they may use to push.



The key point (pun not intended) is that git doesn't record who pushed a 
commit - pushing is just data synchronization, not part of the history. 
What it records is who "authored" the commit, and by default that's just 
plain text; so if somebody compromises an SSH key or access token 
authorised to your GitHub account, they can push commits "authored by" 
Derick, or Nikita, or Bill Gates, and there is no way to tell them apart 
from the real thing.


In fact, you don't need to compromise anybody's key: you could socially 
engineer a situation where you have push access to the repository, or 
break the security in some other way. As I understand it, this is 
exactly what happened 3 years ago: someone gained direct write access to 
the git.php.net server, and added commits "authored by" Nikita and 
others to the history in the repository.


If all commits are signed, a compromised key or account can only be used 
to sign commits with that specific identity: your GitHub account can't 
be used to sign commits as Derick or Nikita, only as you. The impact is 
limited to one identity, not the integrity of the entire repository.


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Rowan Tommins [IMSoP]

On Tue, 2 Apr 2024, at 15:15, Derick Rethans wrote:
> Hi,
>
> What do y'all think about requiring GPG signed commits for the php-src 
> repository?

I actually thought this was already required since the GitHub move (and the 
events that led to it) 3 years ago.

It was certainly discussed: https://externals.io/message/113838#113840 and a 
user guide was created on the PHP wiki: https://wiki.php.net/vcs/commit-signing

Feedback for the idea was generally positive, but maybe nobody got around to 
actually doing it.

-- 
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] Invoke __callStatic when non-static public methods are called statically

2024-04-01 Thread Rowan Tommins [IMSoP]


On 29/03/2024 18:14, Robert Landers wrote:

When generating proxies for existing types, you often need to share
some state between the proxies. To do that, you put static
methods/properties on the proxy class and hope to the PHP Gods that
nobody will ever accidentally name something in their concrete class
with the name you chose for things. To help with that, you create some
kind of insane prefix.



Separating static and non-static methods wouldn't solve this - the 
concrete class could equally add a static method with the same name but 
a different signature, and your generated proxy would fail to compile.


In fact, exactly the same thing happens with instance methods in testing 
libraries: test doubles have a mixture of methods for configuring mock / 
spy behaviour, and methods mimicking or forwarding calls to the real 
interface / class. Those names could collide, and require awkward 
workarounds.


In a statically typed language, a concrete class can have two methods 
with the same name, but different static types, e.g. when explicitly 
implementing interfaces. In a "duck typing" system like PHP's, that's 
much trickier, because a call to $foo->bar() doesn't have a natural way 
to choose which "bar" is meant.




I'd much rather see static and non-static methods being able to
have the same name


Allowing this would lead to ambiguous calls, because as others have 
pointed out, :: doesn't always denote a static call. Consider this code:


class Test {
  public function test() { echo 'instance test'; }
  public static function test() { echo 'static test'; }
}

class Test2 extends Test {
  public function runTest() { parent::test(); }
}

(new Test2)->runTest();

Currently, this can call either of the test() methods if you comment the 
other out: https://3v4l.org/5HlPE https://3v4l.org/LBALm


If both are defined, which should it call? And if you wanted the other, 
how would you specify that? We would need some new syntax to remove the 
ambiguity.
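
A small runnable illustration of the point that :: does not always denote a static call; the class names here are made up for the example.

```php
<?php
class ParentGreeter {
    public string $name = 'parent default';
    public function greet(): string {
        return 'hello from ' . $this->name;
    }
}

class ChildGreeter extends ParentGreeter {
    public function greet(): string {
        // "::" here is not a static call: $this is passed through to the parent method.
        return parent::greet() . '!';
    }
}

$g = new ChildGreeter;
$g->name = 'child instance';
echo $g->greet(), "\n";   // hello from child instance!
```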



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC] Invoke __callStatic when non-static public methods are called statically

2024-04-01 Thread Rowan Tommins [IMSoP]


On 29/03/2024 02:39, 하늘아부지 wrote:

I created a wiki for __callStatic related issues.
Please see:
https://wiki.php.net/rfc/complete_callstatc_magic



Hi,

Several times in the discussion you have said (in different words) 
"__callStatic is called for instance methods which are private or 
protected", but that is not how it is generally interpreted.


If you are calling a method from outside the class, as far as you're 
concerned only public methods exist; private methods are, by definition, 
hidden implementation details. This is more obvious in languages with 
static typing, where if you have an instance of some interface, only the 
methods on that interface exist; the concrete object might actually have 
other methods, but you can't access them.


That is what is meant by "inaccessible": __call and __callStatic are 
called for methods which, as seen from the current scope, *do not exist*.
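
A short sketch of that scope-dependent behaviour, using a made-up class: the same private method is "inaccessible" from outside, and so goes to __call, but is called directly from inside.

```php
<?php
class Greeter {
    private function secret(): string {
        return 'private';
    }
    public function inside(): string {
        return $this->secret();           // visible from this scope, called directly
    }
    public function __call(string $name, array $args): string {
        return "__call($name)";
    }
    public static function __callStatic(string $name, array $args): string {
        return "__callStatic($name)";
    }
}

$g = new Greeter;
var_dump($g->secret());        // "__call(secret)": not visible from outside
var_dump($g->inside());        // "private"
var_dump(Greeter::missing());  // "__callStatic(missing)": no such method at all
```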



You could still argue that static context is like a different scope, or 
a different statically typed interface - as far as that context is 
concerned, only static methods exist. But that's also not a common 
interpretation, for (at least) two reasons:


Firstly, there is no syntax in PHP which specifically marks a static 
call - Foo::bar() is used for both static calls, and for forwarding 
instance calls, most obviously in the case of parent::foo().


Secondly, until PHP 8, marking a method as static was optional; an error 
was only raised once you tried to access $this in a context where it 
wasn't defined. In PHP 4, this was correct code; in PHP 5 and 7, it 
raised diagnostics (first E_STRICT, later E_DEPRECATED) but still ran 
the method:


class Foo {
    function bar() {
        echo 'Hello, World!';
    }
}
Foo::bar();


I think that's part of the reason you're getting negative feedback: to 
you, the feature seems like an obvious extension, even a bug fix; but to 
others, it seems like a complete change to how static calls are interpreted.


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-31 Thread Rowan Tommins [IMSoP]


On 31/03/2024 14:53, Christian Schneider wrote:

But my main question is: I fail to see the difference whether I plant my
  malicious code in configure, configure.ac or *.c: Someone has to review
  the changes and notice the problem. And we have to trust the RMs. What
am I missing?



As I understand it, the attack being discussed involved *code that was never 
committed to version control*. The bulk of the payload was committed in fake 
binary test artifacts, which are unlikely to be inspected but harmless by 
themselves; but the trigger to incorporate it into the binary was 
added *manually* in between the automated build and producing the signed 
release archive.

So the theory is that if there's no human involved in that process, there is no 
way for a human to introduce a malicious change at that step. An exploit would 
need to be introduced somewhere in version controlled, human-readable, code; 
giving extra chances for it to be detected.


On 30/03/2024 18:24, Jakub Zelenka wrote:
Do you think it would be different if the change happened in the 
distributed source file instead? I mean you could still modify tarball 
of the distributed file (e.g. hide somewhere in configure.ac or in our 
case more easily in less visible files like various Makefile.frag and 
similar). The only thing that you get by using just VCS files is that 
people could hash the distributed content of the files and compare it 
with the hash of the VCS files but does anyone do this sort of 
verification?



We already use a version control system built entirely on comparing hashes of source 
files. So given a signed tarball that claimed to match the content of a signed tag, any 
user can trivially check out the tag, expand the tarball, and run "git diff" to 
detect any anomalies.

The question of who would do that in practice is a valid one, and something 
that I'm sure has been discussed elsewhere regarding reproducible binary builds.



On 30/03/2024 15:35, Daniil Gentili wrote:
Btw, I do not believe that "it would require end users to install 
autotools and bison in order to compile PHP from tarballs" is valid 
reason to delay the patching of a serious attack vector ASAP.



As is always the case, there is a trade-off between security and 
convenience - in this case, distributing something that's usable without 
large amounts of extra tooling (including, for some generated files, a 
copy of PHP itself), vs distributing something that is 100% reviewable 
by humans.


Ultimately, 99.999% of users are not going to compile their own copy of 
PHP from source; they are going to trust some chain of providers to take 
the source, perform all the necessary build steps, and produce a binary. 
Removing generated files from the tarballs doesn't eliminate that need 
for trust, it just shifts more of it to organisations like Debian and 
RedHat; and maybe that's a valid aim, because those organisations have 
more resources than us to build appropriate processes.


Making things reproducible aims to attack the same problem from a 
different angle: rather than placing more trust in one part of the 
chain, it allows multiple parallel chains, which should all give the 
same result. If builds from different sources start showing unexplained 
differences, it can be flagged automatically.



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV][RFC] grapheme cluster for str_split, grapheme_str_split function

2024-03-27 Thread Rowan Tommins [IMSoP]

On 26/03/2024 21:14, Casper Langemeijer wrote:

If you need someone to help for the grapheme_ marketing team, let me know.

I think a big part of the problem is that very few people dig into the 
complexities of text encoding, and so don't know that a "grapheme" is 
what they're looking for.

Unicode documentation is, generally, very careful with its terminology - 
distinguishing between "code points", "code units", "graphemes", 
"grapheme clusters", "glyphs", etc. Pretty much everyone else just says 
"character", and assumes that everyone knows what they mean.

As a case in point, looking at the PHP manual pages for strlen, 
mb_strlen, and grapheme_strlen:

Short summary:

- strlen — Get string length
- mb_strlen — Get string length
- grapheme_strlen — Get string length in grapheme units

Description:

- Returns the length of the given string.
- Gets the length of a string.
- Get string length in grapheme units (not bytes or characters)

The first two don't actually say what units they're measuring in. Maybe 
it's millimetres? ;)

The last one uses the term "grapheme" without explaining what it means, 
and makes a contrast with "characters", which is confusing, as one of 
the definitions in the Unicode glossary 
[https://unicode.org/glossary/#grapheme] is:

> What a user thinks of as a character.

The mb_strlen documentation has a bit more explanation in its Return 
Values section:

> Returns the number of characters in string string having character 
> encoding encoding. A multi-byte character is counted as 1.

For Unicode in particular, this is a poor description; it is completely 
missing the term "code point", which is what it actually counts.
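
To make the three units concrete, here is a small sketch using a string made of "e" plus a combining accent. It assumes the mbstring and intl extensions may or may not be loaded, so those calls are guarded.

```php
<?php
$text = "e\u{301}";   // "e" followed by U+0301 COMBINING ACUTE ACCENT

var_dump(strlen($text));                    // int(3): bytes in UTF-8

if (function_exists('mb_strlen')) {         // requires ext/mbstring
    var_dump(mb_strlen($text, 'UTF-8'));    // int(2): code points
}

if (function_exists('grapheme_strlen')) {   // requires ext/intl
    var_dump(grapheme_strlen($text));       // int(1): grapheme clusters
}
```

Most users would describe that string as one "character", which is the answer only the grapheme function gives.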

That's probably because ext/mbstring wasn't written with Unicode in 
mind, it was "developed to handle Japanese characters", back in 2001; 
and it still does support several pre-Unicode "multi-byte encodings". 
For a bit of nostalgia: 
http://web.archive.org/web/20010605075550/http://www.php.net/manual/en/ref.mbstring.php

So... if you want to help make people more aware of the grapheme_* 
functions, one place to start would be editing the documentation for the 
various string, mbstring, and grapheme functions to use consistent 
terminology, and sign-post each other more clearly. 
http://doc.php.net/tutorial/

Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: Make $offset of substr_replace null by default

2024-03-24 Thread Rowan Tommins [IMSoP]

[Aside: please don't use "reply" when starting a new thread. Although 
GMail and its imitators frequently ignore it, a reply contains a header 
telling clients where to add it to an existing thread. I've pasted your 
full text into a new e-mail rather than replying, so it reliably shows 
as its own thread.]



On 23/03/2024 22:58, mickmackusa wrote:

> substr_replace() has the following signature:
>
> substr_replace(
> array|string $string,
> array|string $replace,
> array|int $offset,
> array|int|null $length = null
> ): string|array
>
> Was it deliberate to not allow a null value as the third parameter?
> If permitted to amend this signature, I think it would be sensible to
> set null as the default value for $offset and adopt the same logic as
> the $length parameter.
>
> I have recently stumbled upon what I assume is a code smell in
> multiple SO scripts that use:
>
> $prefixed = preg_filter('/^/', 'prefix_', $array);
>
> It smells because regardless of the passed in array values' types,
> there will always be a starting position of each values which are
> coerced to strings. In other words, the destructive feature of
> preg_filter() is never utilized.
>
> This means that for this task, preg_filter() can be unconditionally
> replaced with preg_replace().
>
> $prefixed = preg_replace('/^/', 'prefix_', $array);
>
> But wait, regex isn't even needed for this task.  It can be coded
> more efficiently as:
>
> $prefixed = substr_replace($array, 'prefix_', 0, 0)
>
> Next, my mind shifted to suffixing/postfixing. By using $ in the pattern.
>
> $prefixed = preg_replace('/$/', 'prefix_', $array);
>
> However, there isn't a convenient way to append a string to each
> value using substr_replace() with the current signature.
>
> If the $offset parameter worked like the $length parameter, then the
> language would provide a native, non-regex tool for appending a static
> string to all array elements.
>
> $suffixed = substr_replace($array, '_suffix');
>
> Finally, I wish to flag the observation that null values inside of an
> array are happily coerced to strings inside of the aforementioned
> functions, but null is not consumable if singularly passed in.
>
> Some examples for context: https://3v4l.org/ENVip
>
> I look forward to hearing feedback/concerns.


Not being familiar with the variations supported by substr_replace, it 
took me a while to understand what was being proposed here.


In case anyone else is similarly lost, a null $length is equivalent to 
strlen($string), meaning "replace to the end"; so a null $offset having 
the same meaning would give "append to the end". On its own, this would 
be pretty pointless:


$foo = substr_replace('abc', 'xyz', null);
// a long-winded way of writing
$foo = 'abc' . 'xyz';
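
For reference, this is how the existing $length parameter behaves in current PHP, which is the behaviour the proposal would mirror for $offset. Passing null explicitly assumes PHP 8.0 or later.

```php
<?php
// Omitted or null $length: replace from the offset to the end of the string.
var_dump(substr_replace('abcdef', 'X', 2));         // "abX"
var_dump(substr_replace('abcdef', 'X', 2, null));   // "abX"

// Zero $length: insert at the offset without removing anything.
var_dump(substr_replace('abcdef', 'X', 2, 0));      // "abXcdef"

// Array input: the same operation is applied to every element.
var_dump(substr_replace(['a', 'b'], 'prefix_', 0, 0));   // ["prefix_a", "prefix_b"]
```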

But the function also has built-in mapping over arrays, so it could be 
used to append the same string to multiple inputs:


$foo = substr_replace(['hello', 'goodbye'], '!', null);

Or append each entry from one list onto each entry in the other:

$foo = substr_replace(['one', 'two'], [' - uno', ' - dos'], null);

Demo: https://3v4l.org/6eEIG


While I can see the logic, it would never occur to me to use any of the 
functions mentioned for this task, rather than using array_map and a 
regular concatenation:


$foo = array_map(fn($string) => $string . '!', ['hello', 'goodbye']);

$foo = array_map(fn($string, $suffix) => $string . $suffix,
    ['one', 'two'], [' - uno', ' - dos']);


Which of course extends to more complex cases:

$foo = array_map(fn($english, $spanish) => "'$english' en Español es '$spanish'",
    ['one', 'two'], ['uno', 'dos']);


$foo = array_map(fn($english, $spanish, $german) => "$english - $spanish - $german",
    ['one', 'two'], ['uno', 'dos'], ['ein', 'zwei']);


https://3v4l.org/d55kT


So, I'm not opposed to the change, but its value seems marginal.


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: AS assertions

2024-03-22 Thread Rowan Tommins [IMSoP]

On Fri, 22 Mar 2024, at 17:38, Claude Pache wrote:
> 
>> On 22 March 2024 at 16:18, Rowan Tommins [IMSoP] wrote:
>> 
>> $optionalExpiryDateTime = $expiry as ?DateTimeInterface else 
>> some_other_function($expiry);
>> assert($optionalExpiryDateTime is ?DateTimeInterface); // cannot fail, 
>> already asserted by the "as"
> 
> I think that the `is` operator is all we need; the `as` operator adds syntax 
> complexity for little gain. Compare:
> 
> $optionalExpiryDateTime = $expiry as ?DateTimeInterface else 
> some_other_function($expiry);
> 
> vs
> 
> $optionalExpiryDateTime = $expiry is ?DateTimeInterface ? $expiry : 
> some_other_function($expiry);


I agree, it doesn't add much; and that's what the draft RFC Ilija linked to 
says as well.

But the point of that particular example is that after the "is" version, you 
don't actually know the type of $optionalExpiryDateTime without looking up the 
return type of some_other_function()

With the "as" version, you can see at a glance that after that line, 
$optionalExpiryDateTime is *guaranteed* to be DateTimeInterface or null, which 
I understood to be the intention of Robert's original proposal on this thread.

-- 
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: AS assertions

2024-03-22 Thread Rowan Tommins [IMSoP]

On Fri, 22 Mar 2024, at 12:58, Robert Landers wrote:
> 
>> $optionalExpiryDateTime = $expiry as ?DateTimeInterface else new 
>> DateTimeImmutable($expiry);
> I'm not sure I can grok what this does...
>
> $optionalExpiryDateTime = ($expiry === null || $expiry instanceof
> DateTimeInterface) ? $expiry : new DateTimeImmutable($expiry)

Trying to write it as a one-liner is going to make for ugly code - that's why 
I'd love to have a new way to write it! But yes, that's the right logic.

With the "is" operator from the Pattern Matching draft, it would be:

$optionalExpiryDateTime = ($expiry is ?DateTimeInterface) ? $expiry : new 
DateTimeImmutable($expiry);

But with a clearer assertion that the variable will end up with the right type 
in all cases:

$optionalExpiryDateTime = $expiry as ?DateTimeInterface else 
some_other_function($expiry);
assert($optionalExpiryDateTime is ?DateTimeInterface); // cannot fail, already asserted by the "as"

> Maybe? What would be the usefulness of this in real life code? I've
> never written anything like it in my life.

I already explained the scenario: the parameter is optional, so you want to 
preserve nulls; but if it *is* present, you want to make sure it's the correct 
type before proceeding. Another example:

// some library function that only supports strings and nulls
function bar(?string $name) {
    if ( $name !== null ) ...
    else ...
}
// a function you're writing that supports various alternative formats
function foo(string|Stringable|int|null $name = null) {
    // we don't want to do anything special with nulls here, just pass them
    // along
    // but we do want to convert other types to string, so that bar() doesn't
    // reject them
    bar($name as ?string else (string)$name);
}

To put it another way, it's no different from any other union type: at some 
point, you will probably want to handle the different types separately, but at 
this point in the program, either type is fine. In this case, the types that 
are fine are DateTimeInterface and null; or in the example above, string and 
null.
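
Since the "as ... else" syntax is only a suggestion, here is a sketch of the same scenario written in PHP as it exists today (8.0 or later). The function names describe_name() and describe_any() are stand-ins invented for this example.

```php
<?php
// Stand-in for the library function: only supports strings and nulls.
function describe_name(?string $name): string {
    return $name === null ? 'anonymous' : "name: $name";
}

// Accepts several formats, preserves null, converts everything else to string.
function describe_any(string|Stringable|int|null $name = null): string {
    $normalised = $name === null ? null : (string)$name;
    return describe_name($normalised);
}

echo describe_any(), "\n";     // anonymous
echo describe_any(42), "\n";   // name: 42
```

The explicit null check does the job, but nothing on that line states that $normalised is guaranteed to be ?string; that guarantee is what the "as" form would add.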

> $optionalExpiryDateTime = $expiry == null ? $expiry : $expiry as
> DateTimeInterface ?? new DateTimeImmutable($expiry as string ?? "now")

If you think that's "readable" then we might as well end the conversation here. 
If that was presented to me in a code review, I'd probably just write "WTF?!" 

I have no idea looking at that what type I can assume for 
$optionalExpiryDateTime after that line, which was surely the whole point of 
using "as" in the first place?

Regards,
-- 
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: AS assertions

2024-03-22 Thread Rowan Tommins [IMSoP]

On Fri, 22 Mar 2024, at 10:05, Robert Landers wrote:
> After asking an AI for some examples and usages, the most compatible
> one would be C#'s. In actuality, I think it could be hugely simplified
> if we simply return null instead of throwing. There'd be no special
> case for |null, and it would move the decision making to the
> programmer:
>
> $x = $a as int ?? throw new LogicException();

It might be relevant that C# has only recently introduced the concept of 
explicitly nullable reference types, with a complex migration process for 
existing code: 
https://learn.microsoft.com/en-us/dotnet/csharp/nullable-migration-strategies 
So in most C# code, there isn't actually a difference between "expect a 
DateTime" and "expect a DateTime or null"

PHP, however, strictly separates those two, and always has; so this would be 
surprising:

$x = $a as DateTime;
assert($x instanceof DateTime); // will fail if $x has defaulted to null!


That's why I suggested that with an explicit default, the default would be 
automatically asserted as matching the specified type:

$x = $a as DateTime else 'No date given'; // TypeError: string given, DateTime expected
$x = $a as DateTime|string else 'No date given'; // OK

$x = $a as DateTime else null; // TypeError: null given, DateTime expected
$x = $a as ?DateTime else null; // OK

If the statement runs without error, $x is guaranteed to be of the type (or 
pattern) given to the "as" operator.
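
A rough emulation of that guarantee in current PHP, limited to class and interface types. The helper name as_instance_or() is hypothetical, and the error message is only modelled on the examples above.

```php
<?php
// Hypothetical helper: return $value if it is an instance of $class,
// otherwise $default; either way the result is checked against $class.
function as_instance_or(mixed $value, string $class, mixed $default): object {
    $result = $value instanceof $class ? $value : $default;
    if (!$result instanceof $class) {
        throw new TypeError(get_debug_type($result) . " given, $class expected");
    }
    return $result;
}

$now = new DateTimeImmutable('2024-03-22');
$x = as_instance_or($now, DateTimeInterface::class, new DateTimeImmutable('@0'));
var_dump($x === $now);   // bool(true)

try {
    as_instance_or('tomorrow', DateTimeInterface::class, 'No date given');
} catch (TypeError $e) {
    echo $e->getMessage(), "\n";   // string given, DateTimeInterface expected
}
```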


Regards,
-- 
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: AS assertions

2024-03-22 Thread Rowan Tommins [IMSoP]

On Fri, 22 Mar 2024, at 08:17, Jordi Boggiano wrote:
> We perhaps could make sure that as does not throw if used with `??`, or that 
> `??` catches the type error and returns the right-hand expression instead:
> So to do a nullable typecast you would do:
> 
> $a as int|float ?? null
> 

While this limits the impact to only expressions combining as with ?? it still 
has the same fundamental problem: you can't meaningfully use it with a nullable 
type.

As a concrete example, imagine you have an optional $description parameter, and 
want to ensure any non-null values are converted to string, but keep null 
unchanged.

At first sight, it looks like you could write this:

$descString = $description as string|null ?? (string)$description;

But this won't work - the ?? swallows the null and turns it into an empty 
string, which isn't what you wanted. You need some syntax that catches the 
TypeError, but preserves the null:

$descString = $description as string|null else (string)$description;
// or
$descString = $description as string|null catch (string)$description;
// or
$descString = $description as string|null default (string)$description;
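
The null-swallowing half of that can be checked in PHP today, without any new syntax. A minimal sketch:

```php
<?php
$description = null;

// ?? treats null as "missing", so the right-hand side is used.
$viaCoalesce = $description ?? (string)$description;
var_dump($viaCoalesce);   // string(0) ""

// An explicit check is needed to keep the null unchanged.
$viaTernary = ($description === null || is_string($description))
    ? $description
    : (string)$description;
var_dump($viaTernary);    // NULL
```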

I actually think there are quite a lot of scenarios where that idiom would be 
useful:

$optionalExpiryDateTime = $expiry as ?DateTimeInterface else new 
DateTimeImmutable($expiry);
$optionalUnixTimestamp = $time as ?int else strtotime((string)$time);
$optionalUnicodeName = $name as ?UnicodeString else new UnicodeString( $name );
etc

And once you have that, you don't need anything special for the null case, it's 
just:

$nameString = $name as ?string else null;

Regards,
-- 
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: AS assertions

2024-03-22 Thread Rowan Tommins [IMSoP]

On 22 March 2024 00:04:27 GMT, Robert Landers  wrote:

>I think that is where we are getting confused: `null` is a value (or
>at least, the absence of a value). The fact that the type system
>allows it to be used as though its a type (along with true and false)
>is interesting, but I think it is confusing the conversation.

Every value needs to belong to some type: for instance, true and false belong 
to the type "boolean", as returned by the gettype() function. There is a value 
called null, and the type it belongs to is also called "null". 

Unlike some languages, PHP has no concept of a typed null reference - you can't 
have "a null DateTime"; you can only have the one universal null, of type null.
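
This is easy to confirm in current PHP; note that gettype() uses the legacy spellings "boolean" and "NULL", while get_debug_type() (PHP 8.0+) uses the names found in type declarations.

```php
<?php
var_dump(gettype(true));           // string(7) "boolean"
var_dump(gettype(null));           // string(4) "NULL"
var_dump(get_debug_type(true));    // string(4) "bool"
var_dump(get_debug_type(null));    // string(4) "null"

// There is no "null DateTime": null is never an instance of any class.
$date = null;
var_dump($date instanceof DateTime);   // bool(false)
```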

The existence of "null" in type checks is therefore necessary if you want to 
allow every value to pass some type check. There isn't any other type that can 
include the value null because the type of null is always null.

That's completely different from true and false, both of which are covered by a 
type check for "bool". They are special cases, which aren't consistent with 
anything else in the type system. The "false" check was added first, as a way 
to express clearly the common pattern in old standard library functions of 
returning false on error. Then "true" was added later, for consistency. Both 
are newer, and far more exotic, than "null".

Disallowing true and false in some type checking contexts would be fine 
(although mostly they're pointless, rather than harmful). Disallowing or 
repurposing null would mean you have an incomplete type system, because there 
is no other type to match a null value against.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: AS assertions

2024-03-21 Thread Rowan Tommins [IMSoP]


On 21/03/2024 19:03, Robert Landers wrote:

I suppose we are taking this from different viewpoints, yours appears
to be more of a philosophical one, whereas mine is more of a practical
one.



My main concern is consistency; which is partly philosophical, but does 
have practical impact - the same syntax meaning the same thing in 
different contexts leads to less user confusion and fewer bugs.


But I also think there are real use cases for "error on anything other 
than either Foo or null" separate from "give me a null for anything 
other than Foo".



$x = $a as null;

(or any other value, such as true|false) appears to have no practical
purpose in this particular case.



There's plenty of possible pieces of code that have no practical 
purpose, but that on its own isn't a good reason to make them do 
something different.


"null" as a standalone type (rather than part of a union) is pretty much 
always pointless, and was forbidden until PHP 8.2. It's now allowed, 
partly because there are scenarios involving inheritance where it does 
actually make sense (e.g. narrowing a return type from Foo|null to 
null); and probably also because it's easier to allow it than forbid it.



That's not really what we're talking about anyway, though; we're talking 
about nullable types, or null in a union type, which are much more 
frequently used.





Further, reading "$x =
$a as null", as a native English speaker, appears to be the same as
"$x = null".



Well, that's a potential problem with the choice of syntax: "$x = $a as 
int" could easily be mistaken for "cast $a as int", rather than "assert 
that $a is int".


If you spell out "assert that $a is null", or "assert that $a is 
int|null", it becomes very surprising for 'hello' to do anything other 
than fail the assertion.




As I mentioned in the beginning, I see this mostly being used when
dealing with mixed types from built-in/library functions, where you
have no idea what the actual type is, but when you write the code, you
have a reasonable expectation of a set of types and you want to throw
if it is unexpected.



My argument is that you might have a set of expected types which 
includes null, *and* want to throw for other, unexpected, values. If 
"|null" is special-cased to mean "default to null", there's no way to do 
that.




Right now, the best way to do that is to simply
set a function signature and pass the mixed type to the function to
have the engine do it for you



And if you do that, then a value of 'hello' passed to a parameter of 
type int|null, will throw a TypeError, not give you a null.


As I illustrated in my last e-mail, you can even (since PHP 8.2) have a 
parameter of type null, and get a TypeError for any other value. That 
may not be useful, but it's entirely logical.
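
A minimal sketch of that existing behaviour, assuming PHP 8.0 or later (takesIntOrNull and describe are made-up names for the example):

```php
<?php
// A parameter typed int|null accepts ints and null, nothing else.
function takesIntOrNull(int|null $x): string {
    return var_export($x, true);
}

// Report either the accepted value or the fact that a TypeError was thrown.
function describe(mixed $value): string {
    try {
        return takesIntOrNull($value);
    } catch (TypeError $e) {
        return 'TypeError';
    }
}

echo describe(42), "\n";      // accepted as an int
echo describe(null), "\n";    // accepted as null
echo describe('hello'), "\n"; // rejected, not turned into null
```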




It makes more sense, from a practical programming
point-of-view, to simply return the value given if none of the types
match.


This perhaps is a key part of our difference: when I see 
"int|bool|null", I don't see any "value given", just three built-in 
types: int, which has a range of values from PHP_INT_MIN to PHP_INT_MAX; 
bool, which has two possible values "true" and "false"; and null, which 
has a single possible value "null".


So there are 2**64 + 2 + 1 possible values that meet the constraint, and 
nothing to specify that one of those is my preferred default if given 
something unexpected.



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: AS assertions

2024-03-21 Thread Rowan Tommins [IMSoP]


On 21/03/2024 15:02, Robert Landers wrote:

I don't think you are getting what I am saying.

$a as int|float

would be an int, float, or thrown exception.

$a as int|float|null

would be an int, float, or null.



I get what you're saying, but I disagree that it's a good idea.

If $a is 'hello', both of those statements should throw exactly the same 
error, for exactly the same reason - the input is not compatible with 
the type you have specified.





Another way of thinking about it is:

$x = $a as null

What do you expect $x to be?



The same as $x inside this function:

function foo(null $x) { var_dump($x); }
foo($a);

Which is null if $a is null, and a TypeError if $a is anything else: 
https://3v4l.org/5UR5A



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: AS assertions

2024-03-21 Thread Rowan Tommins [IMSoP]

On 20/03/2024 23:05, Robert Landers wrote:
> In other
> words, I can't think of a case where you'd actually want a Type|null
> and you wouldn't have to check for null anyway.

It's not about having to check for null; it's about being able to distinguish 
between "a null value, which was one of the expected types" and "a value of an 
unexpected type".

That's a distinction which is made everywhere else in the language: parameter 
types, return types, property types, will all throw an error if you pass a Foo 
when a ?Bar was expected, they won't silently coerce it to null.

> If you think about it, in this proposal, you could use it in a match:
> 
> // $a is TypeA|TypeB|null
> 
> match (true) {
>   $a as ?TypeA => 'a',
>   $a as ?TypeB => 'b',
>   $a === null => 'null',
> }

That won't work, because match performs a strict comparison, and the as 
expression won't return a boolean true. You would have to do this:

match (true) {
  (bool)($a as ?TypeA) => 'a',
  (bool)($a as ?TypeB) => 'b',
  $a === null => 'null',
}

Or this:

match (true) {
  ($a as ?TypeA) !== null => 'a',
  ($a as ?TypeB) !== null => 'b',
  $a === null => 'null',
}

Neither of which is particularly readable. What you're really looking for in 
that case is an "is" operator:

match (true) {
  $a is TypeA => 'a',
  $a is TypeB => 'b',
  $a === null => 'null',
}

Which in the draft pattern matching RFC Ilija linked to can be abbreviated to:

match ($a) is {
  TypeA => 'a',
  TypeB => 'b',
  null => 'null',
}

Of course, in simple cases, you can use "instanceof" in place of "is" already:

match (true) {
  $a instanceof TypeA => 'a',
  $a instanceof TypeB => 'b',
  $a === null => 'null',
}
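
For anyone who wants to run this, here is a self-contained version of the instanceof form, plus a small demonstration of the strict comparison mentioned earlier. TypeA, TypeB, classify and strictDemo are placeholder names, and it assumes PHP 8.0 or later:

```php
<?php
class TypeA {}
class TypeB {}

// Each arm is a boolean expression, so exactly one is identical to true.
function classify(TypeA|TypeB|null $a): string {
    return match (true) {
        $a instanceof TypeA => 'a',
        $a instanceof TypeB => 'b',
        $a === null => 'null',
    };
}

// An object is "truthy", but it is not identical to true,
// so match (true) never selects an arm whose value is an object.
function strictDemo(object $o): string {
    return match (true) {
        $o => 'matched',
        default => 'not matched',
    };
}

echo classify(new TypeA), "\n";
echo classify(new TypeB), "\n";
echo classify(null), "\n";
echo strictDemo(new TypeA), "\n";
```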

> Including `null` in that type
> seems to be that you would get null if no other type matches, since
> any variable can be `null`.
> 

I can't think of any sense in which "any variable can be null" that is not true 
of any other type you might put in the union. We could interpret Foo|false as 
meaning "use false as the fallback"; or Foo|int as "use zero as the fallback"; 
but I don't think that would be sensible.

In other words, the "or null on failure" part is an option to the "as" 
expression, it's not part of the type you're checking against. If we only 
wanted to support "null on failure", we could have a different keyword, like 
"?as":

$bar = new Bar;
$bar as ?Foo; // Error
$bar ?as Foo; // null (as fallback)

$null = null;
$null as ?Foo; // null (because it's an accepted value)
$null ?as Foo; // null (as fallback)

A similar suggestion was made in a previous discussion around nullable casts - 
to distinguish between (?int)$foo as "cast to nullable int" and (int?)$foo as 
"cast to int, with null on error".

Note however that combining ?as with ?? is not enough to support "chosen value 
on failure":

$bar = new Bar;
$bar ?as ?Foo ?? Foo::createDefault(); // creates default object

$null = null;
$null ?as ?Foo ?? Foo::createDefault(); // also creates default object, even 
though null is an expected value

That's why my earlier suggestion was to specify the fallback explicitly:

$bar = new Bar;
$bar as ?Foo else null; // null
$bar as ?Foo else Foo::createDefault(); // default object

$null = null;
$null as ?Foo else null; // null
$null as ?Foo else Foo::createDefault(); // also null, because it's an accepted 
value, so the fallback is not evaluated

Probably, it should then be an error if the fallback value doesn't meet the 
constraint:

$bar = new Bar;
$bar as Foo else null; // error: fallback value null is not of type Foo
$bar as ?Foo else 42; // error: fallback value 42 is not of type ?Foo
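
The proposed semantics can be approximated in userland today. This is only a sketch under my own assumptions: as_or_else is a hypothetical helper, it handles class types only, and a closure stands in for the lazily evaluated fallback expression.

```php
<?php
class Foo {
    public static function createDefault(): Foo { return new Foo; }
}
class Bar {}

// True if $value satisfies "$class", or "?$class" when $nullable is set.
function satisfies(mixed $value, string $class, bool $nullable): bool {
    return $value instanceof $class || ($nullable && $value === null);
}

// Hypothetical stand-in for "$value as Type else fallback".
function as_or_else(mixed $value, string $class, bool $nullable, callable $fallback): mixed {
    if (satisfies($value, $class, $nullable)) {
        return $value; // accepted value: the fallback is never evaluated
    }
    $result = $fallback();
    if (!satisfies($result, $class, $nullable)) {
        throw new TypeError('fallback value does not satisfy the constraint');
    }
    return $result;
}

$fromBar  = as_or_else(new Bar, Foo::class, true, fn() => Foo::createDefault());
$fromNull = as_or_else(null, Foo::class, true, fn() => Foo::createDefault());

var_dump($fromBar instanceof Foo); // the default object was created
var_dump($fromNull);               // null was an accepted value, so it is kept
```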

Regards,
-- 
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: AS assertions

2024-03-20 Thread Rowan Tommins [IMSoP]

On 20 March 2024 12:51:15 GMT, Robert Landers  wrote:

>Oh and there isn't any difference between:
>
>$x as ?Type
>
>or
>
>$x as Type|null

I'm not sure if I've misunderstood your example, or you've misunderstood mine.

I'm saying that this should be an error, because the value is neither an 
instance of Foo nor null:

$a = 42;
$b = $a as Foo|null;

Your earlier example implies that would make $b equal null, which feels wrong 
to me, because it means it wouldn't match this:

$a = 42;
$b = $a as Foo|Bar;

If we want a short-hand for "set to null on error" that should be separate from 
the syntax for a nullable type.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: AS assertions

2024-03-19 Thread Rowan Tommins [IMSoP]


On 19/03/2024 16:24, Robert Landers wrote:

$x = $attributeReflection->newInstance() as ?MyAttribute;
if ($x === null) // do something since the attribute isn't MyAttribute



I think reusing nullability for this would be a mistake - ideally, the 
right-hand side should allow any type, so "$foo as ?Foo" should mean the 
same as "$foo as Foo|null".



A better alternative might be to specify a default when the type didn't 
match:


$x = $attributeReflection->newInstance() as ?MyAttribute else null;
if ($x === null) // do something since the attribute isn't MyAttribute

Which then also allows you to skip the if statement completely:

$x = $attributeReflection->newInstance() as MyAttribute else 
MyAttribute::createDefault();



That then looks a lot like a limited-use version of syntax for catching 
an exception inline, which would be nice as a general feature (but I 
think maybe hard to implement?)


$x = somethingThatThrows() catch $someDefaultValue;


As well as pattern matching, which Ilija mentioned, another adjacent 
feature is a richer set of casting operators. Currently, we can assert 
that something is an int; or we can force it to be an int; but we can't 
easily say "make this an int if safe, but throw otherwise" or "make this 
an int if safe, but substitute null/$someValue otherwise".
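
The closest existing tool I know of is filter_var() with FILTER_VALIDATE_INT. The sketch below wraps it in two hypothetical helpers (intOrThrow and intOrDefault are my own names); note that it bakes in the filter extension's idea of what counts as "safe", which is itself an assumption.

```php
<?php
// "Make this an int if safe, but throw otherwise."
function intOrThrow(mixed $value): int {
    $result = filter_var($value, FILTER_VALIDATE_INT, FILTER_NULL_ON_FAILURE);
    if ($result === null) {
        throw new InvalidArgumentException('value is not a valid integer');
    }
    return $result;
}

// "Make this an int if safe, but substitute null/$default otherwise."
function intOrDefault(mixed $value, ?int $default = null): ?int {
    return filter_var($value, FILTER_VALIDATE_INT, FILTER_NULL_ON_FAILURE) ?? $default;
}

var_dump(intOrThrow('42'));         // a numeric string converts cleanly
var_dump(intOrDefault('42abc'));    // not a valid integer: null
var_dump(intOrDefault('42abc', 0)); // not a valid integer: chosen default
```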


I've been considering how we can improve that for a while, but not 
settled on a firm proposal - there's a lot of different versions we 
*could* support, so choosing a minimal set is hard.



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] Proposal: Arbitrary precision native scalar type

2024-03-18 Thread Rowan Tommins [IMSoP]


On 18/03/2024 04:39, Alexander Pravdin wrote:

I'm not in the context of the core team plans regarding "strict
types". Could you share some details here? What is the current plan
regarding it? To make strict types on by default eventually? Or
something else?



PHP doesn't really have a defined "core team". There are contributors 
who are particularly active at a given time (sometimes, but far from 
always, because someone is paying them), contributors who are 
particularly long-standing and respected, contributors who throw 
themselves into a pet project and make it happen, and so on.


Partly as a consequence of this, it's often hard to pin down any 
long-term plan about anything, outside of what particular people would 
like to see. So Gina's opinion (it was suffixed "IMHO") that strict 
types was a mistake shouldn't be read as "we have a solid plan for what 
is going to replace strict_types which everyone is on board with".


I think a reasonable number of people do share the sentiment that having 
two separate modes was a mistake; and neither mode is actually perfect. 
It's not about "making it on by default", it's about coming up with a 
unified behaviour that makes the setting redundant.



All of which is something of a diversion from the topic at hand, which 
is this:



How can we introduce the ability to write user code in default
decimals and at the same time keep the old way of working as it was
before, to not introduce any troubles into the existing code and not
introduce performance issues? As a user, I would like to have a
choice.




I don't think choice is really what you want: if you were designing a 
language from scratch, I doubt you would say "let's give the user a 
choice of what type 1 / 10 returns". What it's actually about is 
*backwards compatibility*: what will happen to code that expects 1/10 to 
give a float, if it suddenly starts giving a decimal.


For most cases, I think the rule can be as simple as "decimal in means 
decimal out". What's maybe not as obvious at first sight is that this 
can apply to operators as well as functions, and already does: 100 / 10 
gives int(10), but 100.0 / 10 gives float(10.0), as do 100 / 10.0 and 
100.0 / 10.0
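
That existing behaviour is easy to check (this uses get_debug_type(), so it assumes PHP 8.0 or later):

```php
<?php
// The result type of "/" already depends on the operand types and values.
$examples = [
    '100 / 10'   => 100 / 10,
    '100.0 / 10' => 100.0 / 10,
    '100 / 10.0' => 100 / 10.0,
    '1 / 10'     => 1 / 10,
];

foreach ($examples as $expression => $value) {
    echo $expression, ' gives ', get_debug_type($value), "\n";
}
```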


By the same logic, decimal(1) / 10 can produce decimal(0.1) instead of 
float(0.1), and we don't need any fancy directives. Even better if we 
can introduce a shorter syntax for decimal literals, so that it becomes 
1_d / 10



Where things get more complicated is with *fixed-precision* decimals, 
which is what is generally wanted for something like money. What is the 
correct result of decimal(1.03, precision: 2) / 2: decimal(0.515, 3)? 
decimal(0.51, 2)? decimal(0.52, 2)? An error? And what about 
decimal(10) / 3?


If you stick to functions / methods, this is slightly less of an issue, 
because you can have decimal(1.03, 2)->dividedBy(2, RoundingMode::DOWN) 
== decimal(0.51, 2); or decimal(1.03, 2)->split(2) == [ decimal(0.52, 
2), decimal(0.51, 2) ]. (Example names taken directly from the 
brick/money package.)
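
To make the "split" idea concrete, here is a sketch working on integer minor units (pence). splitMinorUnits is a hypothetical function of my own, not the brick/money implementation, and it assumes a non-negative amount and at least one part.

```php
<?php
// Split $amount into $parts shares that differ by at most one unit
// and always add back up to the original amount.
function splitMinorUnits(int $amount, int $parts): array {
    $base = intdiv($amount, $parts);
    $remainder = $amount % $parts;

    $shares = [];
    for ($i = 0; $i < $parts; $i++) {
        // The first $remainder shares each take one extra unit.
        $shares[] = $base + ($i < $remainder ? 1 : 0);
    }
    return $shares;
}

print_r(splitMinorUnits(103, 2));  // 1.03 split two ways
print_r(splitMinorUnits(1000, 3)); // 10.00 split three ways
```

Nothing is lost to rounding here, which is the property a plain division operator can't give you.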


At that point, backwards compatibility is less of an issue as well: make 
the new functions convenient to use, but distinct from the existing ones.



In short, the best way of avoiding declare() directives is not to 
replace them with something else, but to choose a design where nobody 
feels the need for them.


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-03-18 Thread Rowan Tommins [IMSoP]


On 18/03/2024 00:04, Ilija Tovilo wrote:

I realize this is somewhat inconsistent, but I believe it is
reasonable. If you want to expose the underlying property
by-reference, you need to jump through some additional hoops.



I disagree with this reasoning, because I foresee plenty of cases where 
a virtual property is necessary anyway, so doesn't provide any 
additional hoop to jump through.


But there's not much more to say on this point, so I guess we'll leave 
it there.





Again, it depends on how you think about it. As you have argued, for a
get-only property, the backing value should not be writable without an
explicit `set;` declaration. You can interpret `set;` as an
auto-generated hook, or as a marker that indicates that the backing
value is accessible without a hook.



Regardless of which of these views you start with, it still seems 
intuitive to me that accesses inside the get hook would bypass the 
normal rules and write to the raw value.


Leaving aside the implementation, there are three things that can happen 
when you write to a property:


a) the set hook is called
b) the raw property is written to
c) an error is thrown

Inside the dynamic scope of a hook, the behaviour is always (b), and I 
don't see any reason for that to change. From anywhere else, backed 
properties currently try (a) and fall back to (b); virtual properties 
try (a) and fall back to (c).


I do understand that falling back to (b) makes the implementation 
simpler, and works well with inheritance and some use cases; but falling 
back to (c) wouldn't necessarily need a "default hook", just a marker of 
"has hooks".


It occurred to me you could implement it in reverse: auto-generate a 
hook "set => throw new Error;" and then *remove* it if the user opts in 
to the default set behaviour. That would keep the "write directly" case 
optimised "for free"; but it would be awkward for inheritance, as you'd 
have to somehow avoid calling the parent's hook.





The meaning for `set;` is no longer clear. Does it mean that there's a
generated hook that accesses the backing field? Does it mean that the
backing field is accessible without a hook? Or does it mean that it
accesses the parent hook? The truth is, with inheritance there's no
way to look at the property declaration and fully understand what's
going on, unless all hooks must be spelled out for the sake of clarity
(e.g. `get => parent::$prop::get()`).



Yes, I think this is probably a good argument against requiring "set;"

I think "be careful when inheriting only one hook" will always be a key 
rule to teach anyway, because it's easy to mess up (e.g. assuming the 
parent is backed and accessing $this->foo, rather than calling the 
parent's hook implementation). But adding "set;" into the mix probably 
just makes it worse.





I seriously doubt accessing the backing value outside of the current
hook is useful. The backing value is an implementation detail. If it
is absolutely needed, `ReflectionProperty::setRawValue()` offers a way
to do it. I understand the desire for a shorter alternative like
`$field`, but it doesn't seem like the majority shares this desire at
this point in time.



The example of clearAll() is a real use case, which people will 
currently achieve with __get and __set (e.g. the Yii ActiveRecord 
implementation I linked in one of my previous messages).


The alternative wouldn't be reflection, it would just be switching to a 
virtual property with the value stored in a private field. I think 
that's fine, it's just drawing the line of which use cases backed 
properties cover: Kotlin covers more use cases than C#; PHP will cover 
more than Kotlin (methods able to by-pass a hook when called from that 
hook); but it will draw the line here.





A different syntax like `$this->prop::raw` comes with similar
complexity issues, similar to those previously discussed for
`parent::$prop`/`parent::$prop = 'prop'`.



Yeah, I can't even think of a nice syntax for it, let alone a nice 
implementation. Let's leave it as a thought experiment, no further 
action needed. :)



Regarding asymmetric types:


I can't speak for IDEs or static
analyzers, but I'm not sure what makes this case special. We can ask
some of their maintainers for feedback.



In order to reliably tell the user whether "$a->foo = $b->bar;" is a 
type-safe operation, the analyser will need to track two types for every 
property, the "gettable type" and the "settable type", and apply them in 
the correct contexts.


I've honestly no idea whether that will be easy or hard; it will 
probably vary between tools. In particular, I get the impression IDEs / 
editor plugins sometimes have a base implementation used for multiple 
programming languages, and PHP might be the only one that needed this 
extra tracking.



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [Pre-RFC] Improve language coherence for the behaviour of offsets and containers

2024-03-17 Thread Rowan Tommins [IMSoP]


On 11/03/2024 12:52, Gina P. Banyard wrote:

I would like to get some initial feedback on an RFC I've been working on for 
the last 5–6 months.
The RFC attempts to explain, and most importantly, improve the semantics around 
$container[$offset] as PHP is currently widely inconsistent.

[...]

RFC: 
https://github.com/Girgias/php-rfcs/blob/master/container-offset-behaviour.md



Hi Gina,

I've just read through this thoroughly, and am simultaneously impressed 
with your investigation, and amazed at how many inconsistencies you found.



I think the proposed granular interfaces absolutely make sense, given 
the different uses people have for such offsets. My only hesitation is 
that if you want "everything", things become quite verbose:


class Foo implements DimensionFetchable, DimensionWritable, 
FetchAppendable, DimensionUnsettable { ... }

function bar(DimensionFetchable&DimensionWritable&FetchAppendable&DimensionUnsettable $container) { ... }


Unfortunately, I can't think of an easy solution to this without some 
form of type aliases.



As an experiment, I tried writing a variation of Python's "defaultdict" 
[1] using all the new hooks (without actually testing it against any 
implementation). Here's what I came up with: 
https://gist.github.com/IMSoP/fbd60c5379ccefcab6c5af25eacc259b


Most of it is straight-forward, but a couple of things stood out:

* Separating offsetFetch from offsetGet is really useful, because we can 
avoid "auto-vivifying" a key that's only been read, never updated. In 
other words, isset($foo['a']) can remain false after running 
var_dump($foo['a']), but $foo['a']++ should still work.


* The fetchAppend hook is quite confusing to implement, because it's 
used in a few subtly different scenarios. For instance, if it's actually 
$container[][$offset] = $value there is an implicit requirement that 
fetchAppend should return array|DimensionWritable, but presumably that 
has to be enforced after fetchAppend has returned. I'm not sure if 
there's anything that can be improved here; it probably just needs some 
examples in the user manual.
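
For comparison, part of the first point can already be approximated with today's ArrayAccess, as long as you stick to simple reads and compound assignment. This sketch is not the code from the gist; DefaultDict is a cut-down illustration, and I have deliberately used += rather than ++ because how each operation maps onto the interface is exactly the kind of detail the RFC catalogues.

```php
<?php
final class DefaultDict implements ArrayAccess {
    private array $data = [];

    public function __construct(private mixed $default = 0) {}

    public function offsetExists(mixed $offset): bool {
        return isset($this->data[$offset]);
    }

    // Reading a missing key returns the default without storing it.
    public function offsetGet(mixed $offset): mixed {
        return $this->data[$offset] ?? $this->default;
    }

    public function offsetSet(mixed $offset, mixed $value): void {
        if ($offset === null) {
            $this->data[] = $value;
        } else {
            $this->data[$offset] = $value;
        }
    }

    public function offsetUnset(mixed $offset): void {
        unset($this->data[$offset]);
    }
}

$counts = new DefaultDict(0);
$read = $counts['a'];                    // default value, nothing stored
$existsAfterRead = isset($counts['a']);  // still false
$counts['a'] += 1;                       // read the default, then write
$existsAfterWrite = isset($counts['a']); // now true

var_dump($read, $existsAfterRead, $counts['a'], $existsAfterWrite);
```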


[1] 
https://docs.python.org/3/library/collections.html#collections.defaultdict



Over all, I think this is a really great proposal, and hope it proceeds 
smoothly.


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-03-17 Thread Rowan Tommins [IMSoP]

stions I 
would expect to come up if this feature had its own RFC.


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-03-16 Thread Rowan Tommins [IMSoP]


On 16/03/2024 17:51, Ilija Tovilo wrote:

Properties can inherit both storage and hooks from their parent.
Hopefully, that helps with the mental model. Of course, in reality it
is a bit more complicated due to guards and references.



That is a really helpful explanation, thanks; I hadn't thought about the 
significance of inheritance between hooked and non-hooked properties.


I still think there will be a lot of users coming from other languages, 
or from using __get and __set, who will look at virtual properties 
first. Making things less surprising for those people seems worth some 
effort, but I'm not asking for a complete redesign.





Dynamic properties are not particularly relevant today. The point was
not to show how similar these two cases are, but to explain that
there's an existing mechanism in place that works very well for hooks.
We may invent some new mechanism to access the backing value, like
`field = 'value'`, but for what reason? This would only make sense if
the syntax we use is useful for something else. However, given that
without guards it just leads to recursion, which I really can't see
any use for, I don't see the point.



I can think of several reasons we *could* explore other syntax:

1) To make it clearer in code whether a particular line is accessing via 
the hooks, or by-passing them
2) To make the code in the hooks shorter (e.g. `$field` is significantly 
shorter than `$this->someDescriptiveName`)
3) To allow code to by-pass the hooks at will, rather than only when 
called from the hooks (e.g. having a single method that resets the state 
of several lazy-loaded properties)


Those reasons are probably not enough to rule out the current syntax; 
but they show there are trade-offs being made.


To be honest, my biggest hesitation with the RFC remains asymmetric 
types (the ability to specify types in the set hook). It's quite a 
significant feature, with no precedent I know of, and I'm worried we'll 
overlook something by including it immediately. For instance, what will 
be the impact on people using reflection or static analysis to reason 
about types? I would personally be more comfortable leaving that to a 
follow-up RFC to consider the details more carefully.


Nobody else has raised that, beyond the syntax; I'm not sure if that's 
because everyone is happy with it, or because the significance has been 
overlooked.



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-03-16 Thread Rowan Tommins [IMSoP]

On 16 March 2024 00:19:57 GMT, Larry Garfield  wrote:

>Well, reading/writing from within a set/get hook is an obvious use case to 
>support.  We cannot do cached properties easily otherwise:
>
>public string $expensive {
>  get => $this->expensive ??= $this->compute();
>  set {
>    if (strlen($value) < 50) throw new Exception();
>    $this->expensive = $value;
>  }
>}

To play devil's advocate, in an implementation with only virtual properties, 
this is still perfectly possible, just one declaration longer:

private string $_expensive;
public string $expensive {
  get => $this->_expensive ??= $this->compute();
  set {
    if (strlen($value) < 50) throw new Exception();
    $this->_expensive = $value;
  }
}

Note that in this version there is an unambiguous way to refer to the raw value 
from anywhere else in the class, if you wanted a clearAll() method for instance.

I can't stress enough that this is where a lot of my thinking comes from: that 
backed properties are really the special case, not the default. Anything you 
can do with a backed property you can do with a virtual one, but the opposite 
will never be true.

The minimum version of backed properties is basically just sugar for that - the 
property is still essentially virtual, but the language declares the backing 
property for you, leading to:

public string $expensive {
  get => $field ??= $this->compute();
  set {
    if (strlen($value) < 50) throw new Exception();
    $field = $value;
  }
}

I realise now that this isn't actually how the current implementation works, 
but again I wanted to illustrate where I'm coming from: that backed properties 
are just a convenience, not a different type of property with its own rules.

> Being the same also makes the language more predictable, which is also a 
> design goal for this RFC.  (Hence why "this is the same logic as 
> methods/__get/other very similar thing" is mentioned several times in the 
> RFC.  Consistency in expectations is generally a good thing.)

I can only speak for myself, but my expectations were based on:

a) How __get and __set are used in practice. That generally involves reading 
and writing a private property, of either the same or different name from the 
public one; and that private property is visible everywhere equally, no special 
handling based on the call stack.

b) What happens if you accidentally cause infinite recursion in a normal 
function or method, which is that the language eventually hits a stack depth 
limit and throws an error.

So the assertion that the proposal was consistent with expectations surprised 
me. It feels to me like something that will seem surprising to people when they 
first encounter it, but useful once they understand the implications.
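
To illustrate expectation (a), this is roughly the pattern I have in mind with __get and __set today. The Report class, its compute() method and the computeCalls counter are invented for the example, and it assumes PHP 8.0 or later. The private property is equally visible to every method, so clearAll() needs no special handling.

```php
<?php
class Report {
    private ?string $_expensive = null;
    public int $computeCalls = 0;

    public function __get(string $name): mixed {
        if ($name === 'expensive') {
            // Lazily compute and cache in the private backing property.
            return $this->_expensive ??= $this->compute();
        }
        throw new Error("Undefined property: $name");
    }

    public function __set(string $name, mixed $value): void {
        if ($name !== 'expensive') {
            throw new Error("Undefined property: $name");
        }
        if (strlen($value) < 50) {
            throw new LengthException('value too short');
        }
        $this->_expensive = $value;
    }

    // Any method can reach the raw value, whatever the call stack.
    public function clearAll(): void {
        $this->_expensive = null;
    }

    private function compute(): string {
        $this->computeCalls++;
        return str_repeat('x', 50);
    }
}

$report = new Report;
$first = $report->expensive;  // computed
$second = $report->expensive; // served from the cache
$report->clearAll();
$third = $report->expensive;  // computed again

var_dump($report->computeCalls);
```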

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-03-15 Thread Rowan Tommins [IMSoP]

On 15 March 2024 17:11:29 GMT, Larry Garfield  wrote:
>On Wed, Mar 13, 2024, at 10:26 PM, Rowan Tommins [IMSoP] wrote:
>> I think it would be more helpful to justify this design on its own 
>> merits, particularly because it's a significant difference from other 
>> languages (which either don't have a "real property" behind the hooks, 
>> or in Kotlin's case allow access to it only *directly* inside the hook 
>> definitions, via the "field" keyword).
>
>I'm not sure I follow.  The behavior we have currently is very close to how 
>Kotlin works, from a user perspective.

Unless I'm misunderstanding something, the backing field in Kotlin is 
accessible only inside the hooks, nowhere else. I don't know what would happen 
if a hook caused a recursive call to itself, but there's no mention in the docs 
of it bypassing the hooks, only this:

> This backing field can be referenced in the accessors using the `field` 
> identifier

and

> The `field` identifier can only be used in the accessors of the property.

And then a section explaining that more complex hooks should use a separate 
backing property - which is the only option in C#, and roughly what people 
would do in PHP today with __get and __set.

Kotlin does have a special syntax for "delegating" hooks, but looking at the 
examples, they do not use the backing field at all, they have to provide their 
own storage.

>I've lost track of which specific issue you have an issue with or would want 
>changed.  The guards to prevent an infinite loop are necessary, for the same 
>reasons as they are necessary for __get/__set.

I understand that *something* needs to happen if a recursive call happens, but 
it could just be an error, like any other unbounded recursion. 

I can also understand the temptation to make it something more useful than an 
error, and provide a way to access the "backing field" / "raw value" from 
outside the hook. But it does lead to something quite surprising: the same line 
of code does different things depending on how it is called.

I doubt many people have ever discovered that __get and __set work that way, 
since as far as I can see it's only possible to use deliberately if you're 
dynamically adding and unsetting properties inside your class.

So, I don't necessarily think hooks working that way is the wrong decision, I 
just think it's a decision we should make consciously, not one that's obvious.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-03-13 Thread Rowan Tommins [IMSoP]


On 12/03/2024 22:43, Larry Garfield wrote:

It's slightly different, yes.  The point is that the special behavior of a hook is 
disabled if you are within the call stack of a hook, just like the special behavior of 
__get/__set is disabled if you are within the call stack of __get/__set.  What happens 
when you hit an operation that would otherwise go into an infinite loop is a bit 
different, but the "disable to avoid an infinite loop" logic is the same.



I guess I'm looking at it more from the user's point of view: it's very 
rare with __get and __set to have a method that sometimes accesses the 
"real" property, and sometimes goes through the "hook". Either there is 
no real property, or the property has private/protected scope, so any 
method on the classes sees the "real" property *regardless* of access 
via the hook.


I think it would be more helpful to justify this design on its own 
merits, particularly because it's a significant difference from other 
languages (which either don't have a "real property" behind the hooks, 
or in Kotlin's case allow access to it only *directly* inside the hook 
definitions, via the "field" keyword).





The point is to give the user the option for full backwards compatibility when it makes 
sense. This requires jumping through some hoops, which is the point. This is essentially 
equivalent to creating a by-ref getter + a setter, exposing the underlying property. By 
creating a virtual property, we are "accepting" that the two are detached. While we 
could disallow this, we recognize that there may be valid use-cases that we'd like to enable. 
 It also parallels __get/__set, where using &__get means you can write to something 
without going through __set.



I get the impression that to you, it's a given that a "virtual property" 
is something clearly distinct from a "property with hooks", and that 
users will consciously decide between one and the other.


This isn't my expectation; based on what people are used to from 
existing features, and other languages, I expect users to see this as an 
obvious starting point for defining a hooked property:


private int $_foo;
public int $foo { get => $this->_foo; set { $this->_foo = $value; } }

And this as a convenient short-hand for exactly the same thing:

public int $foo { get => $this->foo; set { $this->foo = $value; } }

Choosing one or the other won't feel like "jumping through a hoop", and 
the ability to use an &get hook with one and not the other will simply 
seem like a weird oddity.





In practice I expect virtual properties with both hooks to be very rare.  
Most virtual properties will, I expect, be lazy-computed get-only values.



I don't think this is true. Both of these are, in the terms of the RFC, 
"virtual properties":


public Something $proxied { get => $this->otherObject->thing; set { 
$this->otherObject->thing = $value; } }


public Money $price;
public int $pricePence { get => $this->price->asPence(); set { 
$this->price = Money::fromPence($value); } }


I can also imagine generated classes with "virtual" properties which 
call out to generic "getCached" and "setAndClearCache" methods doing the 
job of this pair of __get and __set methods: 
https://github.com/yiisoft/yii2/blob/master/framework/db/BaseActiveRecord.php#L274







With the change to allow &get in the absence of set, I believe that would 
already work.

cf: https://3v4l.org/3Gnti/rfc#vrfc.property-hooks



Awesome! The RFC should probably highlight this, as it gives a 
significant extra option for array properties.


(Also, good to know 3v4l has a copy of the branch; I hadn't thought to 
check.)



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-03-12 Thread Rowan Tommins [IMSoP]

On 08/03/2024 15:53, Larry Garfield wrote:

Hi folks.  Based on earlier discussions, we've made a number of changes
to the RFC that should address some of the concerns people raised.  We
also had some very fruitful discussions off-list with several developers
  from the Foundation, which led to what we feel are some solid
improvements.

https://wiki.php.net/rfc/property-hooks

Hi Larry,

Thanks again for the continuing hard work on this!

> if a |get| hook for property |$foo| calls method |bar()|, then inside 
that method |$this->foo| will refer to the raw property, both read and 
write. If |bar()| is called from somewhere other than the hook, reading 
from |$this->foo| will trigger the |get| hook. This behavior is 
identical to that already used by |__get| and |__set| today.

I'm slightly confused by this.

If there is an actual property called $foo, then __get and __set will be 
called only when it is out of visibility, regardless of the call stack - 
e.g. a private property will always trigger __get from public scope, and 
always access it directly from private scope: https://3v4l.org/R5Yos 
That seems to differ from what's proposed, where even a private call to 
bar() would trigger the hook.
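
To make the existing behaviour concrete, here is a minimal sketch that runs on current PHP 8. The class and method names are invented for illustration, and it is not a copy of the 3v4l link above. It shows that whether __get is invoked depends only on visibility at the point of access, not on the call stack:

```php
<?php
// Hypothetical class, for illustration only.
class VisibilityExample
{
    private int $foo = 1;
    public array $log = [];

    public function __get(string $name): mixed
    {
        // Reached only when $foo is accessed from outside its visibility.
        $this->log[] = "__get($name)";
        return 42;
    }

    public function bar(): int
    {
        // Private scope: reads the real property directly,
        // no matter who called bar().
        return $this->foo;
    }
}

$e = new VisibilityExample;
$outside = $e->foo;   // public scope, private property: goes through __get
$inside  = $e->bar(); // private scope: direct access, __get is not involved
```

Under the proposed hook rules, the equivalent of bar() would go through the get hook unless it was called from inside that hook, which is the difference being pointed out.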

The protection against recursion appears to only be relevant for 
completely undefined properties. For __get, the direct access can never 
do anything useful - there's nothing to access: https://3v4l.org/2nDZS 
For __set, it is at least possible for the non-recursive write to 
succeed, but only in the niche case of creating a dynamic property: 
https://3v4l.org/dpYOj I'm not sure that there's any equivalent to this 
scenario for property hooks, since they can never be undefined/dynamic.
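
The __set case can be sketched as follows. This is an illustrative example with a made-up class name, assuming PHP 8.0 or later; the attribute only silences the dynamic-property deprecation introduced in PHP 8.2:

```php
<?php
// Hypothetical class, for illustration only.
#[\AllowDynamicProperties]
class DynamicExample
{
    public int $setCalls = 0;

    public function __set(string $name, mixed $value): void
    {
        $this->setCalls++;
        // The recursion guard is active for $name here, so this write
        // does not re-enter __set; it creates a dynamic property instead.
        $this->$name = $value;
    }
}

$m = new DynamicExample;
$m->foo = 'a'; // undefined property: __set runs and creates $foo
$m->foo = 'b'; // $foo now exists and is public: plain write, no __set
```

After the first assignment the property is an ordinary public one, so the guard only ever matters for that first, property-creating write.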

> There is one exception to the above: if a property is virtual, then 
there is no presumed connection between the get and set operations. 
[...] For that reason, |&get| by reference is allowed for virtual 
properties, regardless of whether or not there is a |set| hook.

I don't agree with this, and the example immediately following it 
demonstrates the exact opposite: the &get and set hooks are both 
proxying to the same backing value, and have all the same problems as if 
the property was non-virtual. I would imagine a lot of real-life virtual 
properties would be doing something similar: converting to/from a 
different type, proxying to another object, etc.

I think this exception is unnecessarily complicated: either trust users 
to handle the implications of combining &get with set, or forbid it.

> Additionally, |&get| hooks are allowed for arrays as well, provided 
there is no |set| hook.

I mentioned in a previous e-mail the possibility of using the &get hook 
for array writes. Has this been considered?

That is:

$c->arr['beep'] = 'boop';

Would be equivalent to:

$temp =& $c->arr;
$temp['beep'] = 'boop';
unset($temp);

Which would be valid if $arr had an &get hook defined.
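
As a rough analogy that already runs on current PHP, the same pair of statements works today through a by-reference magic getter. This is only a sketch using &__get, on the assumption that a by-reference get hook would behave similarly; the class and its dump() helper are hypothetical:

```php
<?php
// Hypothetical class, for illustration only.
class ArrayHolder
{
    private array $storage = [];

    // Returning by reference lets callers modify $storage in place,
    // without any __set being involved.
    public function &__get(string $name): mixed
    {
        return $this->storage;
    }

    public function dump(): array
    {
        return $this->storage;
    }
}

$c = new ArrayHolder;

// Direct array write through the by-reference getter.
$c->arr['beep'] = 'boop';

// The expanded form described above.
$temp =& $c->arr;
$temp['foo'] = 'bar';
unset($temp);

$contents = $c->dump();
```

Both writes end up in the same private array, which is the behaviour the suggestion relies on.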

> A |set| hook on a typed property must declare a parameter type that 
is the same as or contravariant (wider) from the type of the property.

> Once a property has both a |get| and |set| operation, however, it is 
no longer covariant or contravariant for further extension.

How do these two rules interact?

Could this:

public string $foo {
   get => $this->_foo;
   set(string|Stringable $value) {
      $this->_foo = (string)$value;
   }
}

be over-ridden by this, where the property's "main type" remains 
invariant but its "settable type" is contravariant?

public string $foo {
   get => $this->_foo;
   set(string|Stringable|SomethingElse $value) {
      $this->_foo = $value instanceof SomethingElse
         ? $value->asString()
         : (string)$value;
   }
}

> ReflectionProperty has several new methods to work with hooks.

There should be some way to reliably determine the "settable type" of a 
property. At the moment, I think you would have to do something like this:

$setHook = $property->getHook(PropertyHookType::Set);
$writeType = $setHook === null ? $property->getType() 
: $setHook->getParameters()[0]->getType();

Once again, I would like to make the case that asymmetric types are an 
unnecessary complication that should be left to Future Scope.

The fact that none of the other languages referenced have such a feature 
should also give us pause. There's nothing to stop us being the first to 
innovate a feature, but we should be extra cautious when doing so, with 
no previous experience to learn from. It also means there is no 
expectation from users coming from other languages that this will be 
possible.

If it genuinely seems useful, it can be added in a follow-up RFC, or 
even a later version of PHP, with little impact on the rest of the 
feature. But if we add it now and regret it, or some detail of its 
implementation, we will be stuck with it forever.

Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-03-04 Thread Rowan Tommins [IMSoP]

ed (or, it seemed to me) that $field was just 
an alias for referencing the "real" property. That's a really tempting 
interpretation, but it's not what's happening.

What's really happening is that the property itself is virtual: every single 
access to it goes through the hooks. But, within the hooks, we have provided a 
magic variable, stored on the object but accessible only there, where the hooks 
can store a value of the same type as the virtual property.

Once I came to that interpretation, it became much more intuitive to call that 
magic variable by a magic name like $field than to re-use the syntax that 
would normally refer to the property, and make it sometimes reference this new 
thing instead.

To re-iterate an earlier point, though, I think the language should choose. 
There should be exactly one way to refer to the backing field, whether that's 
$this->foo, $field, or get_backing_field(). Don't leave users reading each 
other's code and not being sure if it's doing the same thing.


Regards,
-- 
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-02-27 Thread Rowan Tommins [IMSoP]


On 27/02/2024 17:49, Erick de Azevedo Lima wrote:
> It sounds like most people are just really, really pissed off by an 
implicit variable


I think that it could be good to follow the PHP way to mark the 
"magic" stuff, which is putting leading underscores on the magic stuff.



I think that might help; I also think that even if the RFC offers a 
choice to the list, the final implementation should not offer choice to 
users.


I think part of what put people off with the original wording was that 
it implied $field was an alias for $this->propertyName, but the alias 
was "preferred". The reality is that we have a new thing that we need a 
name/syntax for, and $field or $this->propertyName are possible options.


To avoid another lengthy e-mail, I've put together some alternative RFC 
wording. The main idea is to switch the framing from "hooks on top of 
properties, which may be virtual" to "hooked properties which are 
virtual by default, but may access a special backing field".


As noted in the introduction this is *not* intended as a 
counter-proposal or critique, just somewhere to collate my thoughts and 
suggestions: https://wiki.php.net/rfc/property-hooks/imsop-suggestion


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-02-27 Thread Rowan Tommins [IMSoP]

On 26 February 2024 23:11:16 GMT, Frederik Bosch wrote:

>And what happens in the following situation, how are multiple get calls 
>working together?
>
>public string $fullName {
>    get => $this->first . ' ' . $this->last; // is this accessing the backed 
>value, or is it accessing via get
>    set($value) => $this->fullName = $value;
>}
>
>public string $first {
>    get => explode(' ', $this->fullName)[0], // is this accessing the backed 
>value, or is it accessing via get
>    set($value) => $value;
>}

I don't think it's *that* confusing - the rule is not "hooks vs methods", it's 
"special access inside the property's own hook". But as I say, I'm coming 
around to the idea that using a different name for that "backing field" / "raw 
value" might be sensible.

>> What would happen if a setter contained both "return 42;" and "return;"? The 
>> latter is explicitly allowed in "void" functions, but is also allowed in a 
>> non-void function as meaning "return null;"
>return 42; // returns (int)42
>return; // early return, void, same as no return
>return null; // returns null

I'm not sure if you misunderstood my question, or just the context of why I 
asked it. I'm talking about a hook like this:

set($value) { if ($value) { return 42; } else { return; } }

Currently, the only definition of "void" in the language is that a void 
function must not contain an explicit return value. We could turn that check 
around, and deduce that a certain hook is void. This hook would not pass that 
check, so we would compile it to have an assignment, and the false case would 
assign null to the property. To avoid that, we would need some additional 
analysis to prove that in all possible paths, a return statement with a value 
is reached.

The alternative would be to run the code, and somehow observe that it "returned 
void". But "void" isn't a value we can represent at run-time; we would need to 
set the return value to some special value just for this specific case. We 
would have to turn that on just for hook bodies, as returning it from normal 
functions would be a huge BC break, and also not very useful - with union 
types, there would be plenty of better options for a function to indicate a 
return value that needs special handling.
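
The point that "void" is not observable at run-time can be checked on current PHP with ordinary functions. This is a minimal sketch with invented function names; it says nothing about how hooks would actually be compiled:

```php
<?php
// Same shape as the hook body above, written as a plain function.
function mixedReturns(bool $flag)
{
    if ($flag) {
        return 42;
    } else {
        return; // allowed in a non-void function; means "return null;"
    }
}

function declaredVoid(): void
{
    return; // the only form of return a void function may contain
}

$withValue    = mixedReturns(true);
$withoutValue = mixedReturns(false);
$fromVoid     = declaredVoid();
```

The caller sees null both from the bare "return;" and from the void function, so there is nothing to distinguish "returned nothing" from "returned null" without extra machinery.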

>$generator = setCall($class, 'first', $value);
>foreach ($generator as $value) {
>   writeProperty($class, 'first', $value);
>}
>if ($generator->hasReturn()) {
>writeProperty($class, 'first', $generator->getReturn());
>}

That's already an order of magnitude more complicated than "the return value is 
used on the right-hand side of an assignment", and it's missing at least one 
case: set($value) { return $value; } will not compile to a generator, so needs 
to skip and assign the value directly.
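
A runnable sketch of the extra machinery, using only the existing Generator API. The applySetHook() helper is hypothetical and simply collects the values that would be written; note that Generator has getReturn() but no hasReturn(), so a null return is treated here as "nothing to write", which is an assumption of this sketch:

```php
<?php
// Hypothetical driver: returns the list of values that would be
// written to the property, in order.
function applySetHook(callable $hook, mixed $value): array
{
    $writes = [];
    $result = $hook($value);

    if (!$result instanceof Generator) {
        // Body without "yield": not a generator, so assign directly.
        $writes[] = $result;
        return $writes;
    }

    foreach ($result as $yielded) {
        $writes[] = $yielded;
    }

    // Only valid once the generator has finished, which the loop ensures.
    $final = $result->getReturn();
    if ($final !== null) {
        $writes[] = $final;
    }

    return $writes;
}

$plain = applySetHook(fn ($v) => strtoupper($v), 'abc');
$yielding = applySetHook(function ($v) {
    yield $v . '1';
    return $v . '2';
}, 'abc');
```

Even in this reduced form, there are two branches and a loop where assign-by-return needs a single assignment.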

By "magic", what I meant was "hidden logic underneath that makes it work". 
Assign-by-return has a small amount of magic - you can express it in half a 
line of code; assign-by-yield has much more magic - a whole bunch of loops and 
conditionals to operate your coroutine.

> The yield is much more intuitive than magic fields

I think we'll just have to differ in opinion on that one. Maybe you're just 
more used to working with coroutines than I am.

Note that yield also doesn't solve how to read the current backing value in a 
get hook (or a set hook that wants to compare before and after), so we still 
need some way to refer to it.

Regards,
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-02-26 Thread Rowan Tommins [IMSoP]


On 26/02/2024 20:21, Frederik Bosch wrote:
I do note that $this->propName might suggest that the backing value is 
accessible from other locations than only the property's own get/set 
methods, because of $this usage. 



Yes, I actually stumbled over that confusion when I was writing some of 
the examples in my lengthy e-mail in this thread. As I understand it, 
this would work:


public string $foo {
    get { $this->foo ??= 0; $this->foo++; return $this->foo; }
    set { throw new Exception; }
}

Outside the hooks, trying to write to $this->foo would throw the 
exception, because it refers to the hooked property as a whole; but 
inside, the same name refers to something different, which isn't 
accessible anywhere else.


Now that I've looked more at how Kotlin uses "field", I understand why 
it makes sense - it's not an alias for the property itself, but the way 
to access a "backing store" which has no other name.


Using $this->foo as the name is tempting if you think of hooks as 
happening "on top of" the "real" property; but that would be a different 
feature, like Swift's "property observers" (willSet and didSet). What's 
really happening is that we're declaring two things at once, and giving 
them the same name; almost as if we'd written this:


public string $foo {
    get { static $_foo; $_foo ??= 0; $_foo++; return $_foo; }
    set { throw new Exception; }
}

Kotlin's "field" is kind of the equivalent of that "static $_foo"



Regarding returning void=null, this is something that IDE and static 
analyzers already pick-up as an error. I think being stricter on that 
in this RFC would actually make sense, and treat void not as null.




What would happen if a setter contained both "return 42;" and "return;"? 
The latter is explicitly allowed in "void" functions, but is also 
allowed in a non-void function as meaning "return null;"



And why yield is magic, I do not get that. The word and the expression 
actually expresses that something is, well, yielded.




But yielded to where? My mental model of "return to set" is that this:

public string $name { set($value) { $x = something($value); return $x + 
1; } }


Is effectively:

private function _name_set($value) { $x = something($value); return $x + 
1; }

plus:
$this->name = $this->_name_set($value);

With "yield", I can't picture that simple translation; the "magic" is 
whatever translates the "yield" keyword into "$this->name ="


I would file it with the type widening in the RFC: seems kind of cool, 
but probably isn't worth the added complexity.



Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-02-26 Thread Rowan Tommins [IMSoP]


On 26/02/2024 19:02, Frederik Bosch wrote:


That's how it always has been, no? So in your example, short code 
abbreviated form would not work. One has to write a block.


 public string $fullName {
     set => [$this->first, $this->last] = explode(' ', \ucfirst($value)); // error, $fullName is a string, returning array
 }

 public string $fullName {
     set {
         [$this->first, $this->last] = explode(' ', \ucfirst($value)); // no error, not returning
     }
 }



I think the intention is that both the block and the arrow syntax would 
have any return value ignored, as happens with constructors, for 
example. Note that in PHP, there is actually no such thing as "a 
function not returning a value", even a "void" function actually returns 
null; so if the return value was treated as meaningful, your second 
example would give an error "cannot assign null to property of type string".
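
The error in question can be reproduced on current PHP with an ordinary typed property. This is a minimal sketch with made-up names, not hook syntax:

```php
<?php
// Hypothetical class and function, for illustration only.
class Person
{
    public string $fullName = 'unset';
}

function returnsNothing(): void
{
    // No return statement: the call still evaluates to null.
}

$p = new Person;
$message = null;

try {
    // Treating the "void" result as meaningful means assigning null
    // to a non-nullable string property.
    $p->fullName = returnsNothing();
} catch (TypeError $e) {
    $message = $e->getMessage();
}
```

The assignment throws a TypeError and the property keeps its previous value, which is what would happen to the block form if its return value were used.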


However, as noted in a previous message, I agree that the short form 
meaning "the value returned is saved to the backing field" is both more 
expected and more useful.


The "yield" idea is ... interesting. I think personally I find it a bit 
too magic, and too cryptic to be more readable than an explicit 
assignment. Opinions may vary, though.


Regards,

--
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] is this thing on?

2024-02-26 Thread Rowan Tommins [IMSoP]

On Sun, 25 Feb 2024, at 20:02, Rob Landers wrote:
> Before I get to the meat of this email, first of all, IMHO, anyone should be 
> able to email the list, even if they are not a 
> member of the list. I've had to email ubuntu lists about bugs before and I 
> really have no desire to join those lists, but
> I was always able to just send the email to the list just fine.

The biggest problem with an open list is how to manage spam - if you don't 
catch the spam on the list server, it not only ends up in hundreds of inboxes, 
but in multiple archives and mirrors of the list. I don't know how the lists 
you mentioned handle that.

This has also come up in the past regarding moving from e-mail to 
$currently_fashionable_technology - having some barrier to entry is actually 
quite useful, since we want people to put some effort into their contributions 
beyond "me too" or "I had this crazy idea in the pub".

Note that this is exactly why bugs.php.net was abandoned: there was too much 
spam and low-quality content.

> Now for the issue:
> 
> gmail is failing to send emails to the list (hence why it has probably been a 
> bit quiet around here). Here is the error:
> 
> The response from the remote server was:
> 451 4.3.0 : Temporary lookup failure

People are aware of this issue, and looking into it. In case you missed the 
previous thread, two things have unfortunately happened at once:

- The mailing list was moved to a new server
- GMail rolled out a much tighter set of anti-spam rules

It's not immediately clear which of these is responsible for the 451 errors, 
but as I say, people are working on it.

> Now, to go figure out how to unsubscribe this email from the list...

Exactly the same way you subscribed, I believe: via the web form, or using 
+unsubscribe in the to address.

Regards,
-- 
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-02-26 Thread Rowan Tommins [IMSoP]

On Thu, 22 Feb 2024, at 23:56, Larry Garfield wrote:
> However, I just had a long discussion with Ilija and there is one 
> possibility we could consider: Use the return value only on the 
> shorthand (arrow-function-like) syntax.
>
> So you could do either of these, which would be equivalent:
>
> set {
>   $this->phone = $this->santizePhone($value);
> }
>
> set => $this->santizePhone($value);

Regarding this point, I've realised that the current short-hand set syntax 
isn't actually any shorter:

set { $this->phone = $this->santizePhone($value); }
set => $this->phone = $this->santizePhone($value);

It also feels weird to say both "the right-hand side must be a valid 
expression" and "the value of the expression is ignored".

So I think making the short-hand be "expression to assign to the implicit 
backing field" makes a lot more sense.

Regards,
-- 
Rowan Tommins
[IMSoP]

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

2024-02-25 Thread Rowan Tommins [IMSoP]

name{ get => new  UnicodeString($this->name_string);
set=> $this->name_string  =  (string)$value;
}
public string $name_string;

This exotic "asymmetric typing" is then being used to justify other 
decisions - if you can specify the setter's type, it's confusing if you 
specify a name without a type; so we need to make the name optional as 
well... Compare to C#, where "value" is not a default, it's an 
unchangeable keyword; or Kotlin, where naming it is mandatory but 
doesn't have to mention the type.



I think my concerns about distinguishing "virtual properties" may stem 
from a similar cause.


In C#, all "properties" are virtual - as soon as you have any 
non-default "get", "set" or "init" definition, it's up to you to declare 
a separate "field" to store the value in. Swift's "computed properties" 
are similar: if you have a custom getter or setter, there is no backing 
store; to add behaviour to a "stored property", you use the separate 
"property observer" hooks.


Kotlin's approach is philosophically the opposite: there are no fields, 
only properties, but properties can access a hidden "backing field" via 
the special keyword "field". Importantly, omitting the setter doesn't 
make the property read-only, it implies set(value) { field = value }


The current RFC attempts to combine all of these ideas into one syntax, 
on top of everything the language already has. The result has some 
odd-shaped corners. For instance, this won't work:


public string $name { set => throw new Exception('Read-only property ' . 
__PROPERTY__); }


But this will:

public string $name { set => throw new Exception('Read-only property ' . 
__PROPERTY__ . '; current value is: ' . $this->name); }


The first declares a virtual property, with no default getter, like in 
C# or Swift. The second instead acts like Kotlin, and has a default 
getter referencing the implicit backing field.


It would be clearer to choose one style or the other: explicitly enable 
the defaults...


// default getter and backing field requested
public string $name { get; set => throw new Exception('Read-only property ' . __PROPERTY__); }
// setter disabled because it's not mentioned, even though backing field is used
public string $name { get => $this->name ??= $this->generateName(); }


...or explicitly disable them:

// implied default getter and backing field
public string $name { set => throw new Exception('Read-only property ' . __PROPERTY__); }
// setter disabled because property is declared virtual
public virtual string $name { get => $this->firstName . ' ' . $this->lastName; }



I think there's some really great functionality in the RFC, and would 
love for it to succeed in some form, but I think it would benefit from 
removing some of the "magic".



Regards,

--
Rowan Tommins
[IMSoP]
