Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Saki Takamachi
PS: Yep, this is pretty much Jordan's library idea.

Saki


Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Saki Takamachi
Hi Jordan,

> The issue is that, presumably, this method will be used within the operator 
> overload portion of the class entry in C. If it is allowed to be overridden, 
> then this RFC is sort of providing a stealth operator overload to PHP 
> developers. As much as I am for operator overloads having written an RFC for 
> it, and as much as I find the arguments generally against it lacking, I am 
> not in favor of doing it that way with a kind of... unspoken capability to 
> overload the basic math operators in userland. I very much like the feature, 
> but I also think it should be intentionally and specifically designed, which 
> is why I spent a long time on it. I do not get a vote for RFCs, but I would 
> vote against this if I could just for that reason IF the calculation methods 
> were not private, the class was not final, AND the function entry was used in 
> the operator overload.
> 
> And operator overloads are also the place where what you outlined above gets 
> murky. I think what you outlined is very close to a good final design for 
> just the method usage side, but the operator usage side CANNOT provide a 
> scale or a rounding mode. That should be taken into consideration, because 
> allowing this object to be used with operators is probably the single largest 
> benefit this RFC will provide to PHP developers.
> 
> What I ended up doing was that the VALUE of the object was immutable, but the 
> other information was not immutable. That has its own downsides, but does 
> allow for very explicit control from the developer at the section of code 
> using the class, but also avoids creating copies of the object or 
> instantiating a new object for every single "setting" change during 
> calculations.

Agree. I also have negative thoughts about this, but I wanted to hear 
everyone's opinions, so I sent the email I mentioned earlier.

If we make the class final, users will not be able to add arbitrary methods, so 
I think making each calculation method final is the better option. Although it 
is possible to completely separate behavior between method and opcode 
calculations, that would be inconsistent and confusing to users and should be 
avoided.

> I should clarify, the portion of your outline that I feel is not sufficient 
> for the operator overload use case is that there is no way to use both 
> operator overloads AND a scale other than 10 + left operand scale.


How about setting the `$scale` and `$roundMode` that I mentioned earlier in the 
constructor instead of in `div` and `pow`? By doing so, we can use any scale 
when calculating with opcodes. If we want to specify a different scale for each 
calculation, we can do that using methods.
Or would it be more convenient to have a method to reset `$scale` and 
`$roundMode`? Since the object is immutable, resetting would return a new 
instance.
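
A minimal sketch of that idea, assuming hypothetical names (`SketchNumber`, `SketchRounding`, `withScale`, `withRoundMode`) that are not part of the RFC: the scale and rounding mode are fixed at construction, and "resetting" either one returns a new instance instead of mutating the existing one.

```php
<?php
// Hypothetical rounding modes, for this sketch only.
enum SketchRounding
{
    case TowardZero;
    case HalfUp;
}

// Hypothetical stand-in for the proposed class; not the RFC's actual API.
final class SketchNumber
{
    public function __construct(
        public readonly string $value,
        public readonly ?int $scale = null, // null = implicit scale
        public readonly SketchRounding $roundMode = SketchRounding::TowardZero,
    ) {}

    // "Reset" the scale: the original object is left untouched.
    public function withScale(?int $scale): self
    {
        return new self($this->value, $scale, $this->roundMode);
    }

    // "Reset" the rounding mode: again, a new instance is returned.
    public function withRoundMode(SketchRounding $roundMode): self
    {
        return new self($this->value, $this->scale, $roundMode);
    }
}

$a = new SketchNumber('0.01', 2);
$b = $a->withScale(4); // $a still has scale 2
```

Because the settings travel with the object, an operator-based calculation could read them from the left operand without any extra arguments.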

Regards.

Saki

Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Jordan LeDoux
On Tue, Apr 2, 2024 at 5:05 PM Jordan LeDoux 
wrote:

>
>
> On Tue, Apr 2, 2024 at 4:50 PM Saki Takamachi  wrote:
>
>>
>> The two use cases at issue here are when the div and pow's exponent are
>> negative values. So how about allowing only these two methods to optionally
>> set `$scale` and `$roundMode` ?
>>
>> - The constructor takes only `$num` and always uses implicit scaling.
>> There is no option for the user to specify an arbitrary scale.
>> - `$scale`: If specified, use that value, otherwise use `10`. The scale
>> specified here is added to the scale of the left operand and used as the
>> scale of the result. In other words, `(new Number('0.01'))->div('3', 2)`
>> results in `'0.0030' // scale = 2 + 2 = 4`.
>> - `$roundMode`: Specifies the rounding method when the result does not
>> fit within the scale. The initial value is `PHP_ROUND_TOWARD_ZERO`, which
>> matches the behavior of the BCMath function. That is, just truncate.
>> - If the result happens to fit within the scale, apply the implicit
>> scale to the result. In other words, when calculating `1 / 2`, the
>> resulting scale will be `1`, even if scale is `null` or a value such as
>> `20` is specified.
>> - The result of a calculation with operator overloading is the same as if
>> the option was not used when executing the method.
>>
>> However, I'm not sure if naming it `$scale` is appropriate.
>>
>> Also, since `BCMath\Number` is not made into a final class, there is a
>> possibility of implementing an inherited class in userland. Regarding this,
>> is it better to make the calculation method a final method, or to use a
>> function overridden by the user when executing from the opcode?
>>
>>
> The issue is that, presumably, this method will be used within the
> operator overload portion of the class entry in C. If it is allowed to be
> overridden, then this RFC is sort of providing a stealth operator overload
> to PHP developers. As much as I am for operator overloads having written an
> RFC for it, and as much as I find the arguments generally against it
> lacking, I am not in favor of doing it that way with a kind of... unspoken
> capability to overload the basic math operators in userland. I very much
> like the feature, but I also think it should be intentionally and
> specifically designed, which is why I spent a long time on it. I do not get
> a vote for RFCs, but I would vote against this if I could just for that
> reason IF the calculation methods were not private, the class was not
> final, AND the function entry was used in the operator overload.
>
> And operator overloads are also the place where what you outlined above
> gets murky. I think what you outlined is very close to a good final design
> for just the method usage side, but the operator usage side CANNOT provide
> a scale or a rounding mode. That should be taken into consideration,
> because allowing this object to be used with operators is probably the
> single largest benefit this RFC will provide to PHP developers.
>
> What I ended up doing was that the VALUE of the object was immutable, but
> the other information was not immutable. That has its own downsides, but
> does allow for very explicit control from the developer at the section of
> code using the class, but also avoids creating copies of the object or
> instantiating a new object for every single "setting" change during
> calculations.
>
> Jordan
>

I should clarify, the portion of your outline that I feel is not sufficient
for the operator overload use case is that there is no way to use both
operator overloads AND a scale other than 10 + left operand scale.

Jordan


Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Jordan LeDoux
On Tue, Apr 2, 2024 at 4:50 PM Saki Takamachi  wrote:

>
> The two use cases at issue here are when the div and pow's exponent are
> negative values. So how about allowing only these two methods to optionally
> set `$scale` and `$roundMode` ?
>
> - The constructor takes only `$num` and always uses implicit scaling.
> There is no option for the user to specify an arbitrary scale.
> - `$scale`: If specified, use that value, otherwise use `10`. The scale
> specified here is added to the scale of the left operand and used as the
> scale of the result. In other words, `(new Number('0.01'))->div('3', 2)`
> results in `'0.0030' // scale = 2 + 2 = 4`.
> - `$roundMode`: Specifies the rounding method when the result does not fit
> within the scale. The initial value is `PHP_ROUND_TOWARD_ZERO`, which
> matches the behavior of the BCMath function. That is, just truncate.
> - If the result happens to fit within the scale, apply the implicit
> scale to the result. In other words, when calculating `1 / 2`, the
> resulting scale will be `1`, even if scale is `null` or a value such as
> `20` is specified.
> - The result of a calculation with operator overloading is the same as if
> the option was not used when executing the method.
>
> However, I'm not sure if naming it `$scale` is appropriate.
>
> Also, since `BCMath\Number` is not made into a final class, there is a
> possibility of implementing an inherited class in userland. Regarding this,
> is it better to make the calculation method a final method, or to use a
> function overridden by the user when executing from the opcode?
>
>
The issue is that, presumably, this method will be used within the operator
overload portion of the class entry in C. If it is allowed to be
overridden, then this RFC is sort of providing a stealth operator overload
to PHP developers. As much as I am for operator overloads having written an
RFC for it, and as much as I find the arguments generally against it
lacking, I am not in favor of doing it that way with a kind of... unspoken
capability to overload the basic math operators in userland. I very much
like the feature, but I also think it should be intentionally and
specifically designed, which is why I spent a long time on it. I do not get
a vote for RFCs, but I would vote against this if I could just for that
reason IF the calculation methods were not private, the class was not
final, AND the function entry was used in the operator overload.

And operator overloads are also the place where what you outlined above
gets murky. I think what you outlined is very close to a good final design
for just the method usage side, but the operator usage side CANNOT provide
a scale or a rounding mode. That should be taken into consideration,
because allowing this object to be used with operators is probably the
single largest benefit this RFC will provide to PHP developers.

What I ended up doing was that the VALUE of the object was immutable, but
the other information was not immutable. That has its own downsides, but
does allow for very explicit control from the developer at the section of
code using the class, but also avoids creating copies of the object or
instantiating a new object for every single "setting" change during
calculations.

Jordan


Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Saki Takamachi
Hi Barney, Jordan,

> I think that's a sufficiently different data type that it should be a 
> different class (if required), and probably a separate RFC, and for now it's 
> better to stay closer to the existing BCMath API.
> 
> Developers should be prepared to accept that an arbitrary precision decimal 
> can't represent 1/3 exactly, just like a binary float can't represent 1/10 
> exactly.

> Again, my experience on the issue is with the development of my own library 
> on the issue, however in my case I fully separated that kind of object into 
> its own class `Fraction`, and gave the kinds of operations we've been 
> discussing to the class `Decimal`. Storing numerators and denominators for as 
> long as possible involves a completely different set of math. For instance, 
> you need an algorithm to determine the Greatest Common Factor and the Least 
> Common Multiple in such a class, because there are a lot of places where you 
> would need to find the smallest common denominator or simplify the fraction.
> 
> Abstracting between the `Fraction` and `Decimal` so that they worked with 
> each other honestly introduced the most complex and inscrutable code in my 
> entire library, so unless fractions are themselves also a design goal of this 
> RFC, I would recommend against it.
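
To give a sense of the extra math a fraction type needs, here is a rough sketch of GCD, LCM and fraction simplification on native integers. The function names are illustrative assumptions, and an arbitrary-precision version would have to operate on numeric strings instead.

```php
<?php
// Euclid's algorithm; the result is always non-negative.
function sketch_gcd(int $a, int $b): int
{
    $a = abs($a);
    $b = abs($b);
    while ($b !== 0) {
        [$a, $b] = [$b, $a % $b];
    }
    return $a;
}

// Least common multiple, e.g. for finding a common denominator.
function sketch_lcm(int $a, int $b): int
{
    if ($a === 0 || $b === 0) {
        return 0;
    }
    return intdiv(abs($a), sketch_gcd($a, $b)) * abs($b);
}

// Returns [numerator, denominator] in lowest terms with a positive denominator.
function sketch_simplify(int $numerator, int $denominator): array
{
    if ($denominator === 0) {
        throw new DivisionByZeroError('Denominator must not be zero');
    }
    $g = sketch_gcd($numerator, $denominator);
    if ($denominator < 0) {
        $g = -$g; // move the sign to the numerator
    }
    return [intdiv($numerator, $g), intdiv($denominator, $g)];
}
```

None of this is needed for a plain decimal type, which is part of the argument for keeping fractions out of scope.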

> Having two classes `Fraction` and `Decimal` necessitated that I had a 
> `Number` class they both extended, as there are many situations where I would 
> want to type-hint "anything that calculation can be done on with arbitrary 
> precision" instead of specifically one or the other. I also provided the 
> `NumberInterface`, `DecimalInterface`, and `FractionInterface`, though I 
> don't think that would be necessary here as this is much more just a wrapper 
> for BCMath than an extension of it. The main goal of my library was not to 
> act as a wrapper for BCMath, it was to EXTEND BCMath with additional 
> capabilities, such as trigonometric functions that have arbitrary precision, 
> so keep that in mind when weighing input of mine that is referencing the work 
> I have done on this topic. The design goals were different.

Agree. I was a little too focused on precision and lost sight of the larger 
goal.


> In my library, if the scale is unspecified, I actually set the scale to 10 OR 
> the length of the input string, including integer decimals, whichever is 
> larger. Since I was designing my own library I could do things like that as 
> convention, and a scale of 10 is extremely fast, even with the horrifically 
> slow BCMath library, but covers most use cases (the overwhelmingly common of 
> which is exact calculation of money).
> 
> My library handles scale using the following design. It's not necessarily 
> correct here, as I was designing a PHP library instead of something for core, 
> AND my library does not have to deal with operator overloads so I'm always 
> working with method signatures instead, AND it's possible that my 
> class/method design is inferior to other alternatives, however it went:
> 
> 1. Each number constructor allowed for an optional input scale.
> 2. The input number was converted into the proper formatting from allowed 
> input types, and then the implicit scale is set to the total number of digits.
> 3. If the input scale was provided, the determined scale is set to that value.
> 4. Otherwise, the determined scale at construction is set to 10 or the 
> implicit scale of "number of digits", whichever is larger.
> 5. The class contained the `roundToScale` method, which allowed you to 
> provide the desired scale and the rounding method, and then would set the 
> determined scale to that value after rounding. It contained the `round` 
> method with the same parameters to allow rounding to a specific scale without 
> also setting the internal determined scale at the same time.
> 6. The class contained the `setScale` method which set the value of the 
> internal determined scale value to an int without mutating the value at all.
> 7. All mathematical operation methods which depended on scale, (such as div 
> or pow), allowed an optional input scale that would be used for calculation 
> if present. If it was not present, the internal calculations were done by 
> taking the higher of the determined scale between the two operands, and then 
> adding 2, and then the result was done by rounding using the default method 
> of ROUND_HALF_EVEN if no rounding method was provided.
> 
> Again, though I have spent a lot of design time on this issue for the math 
> library I developed, my library did not have to deal with the RFC process for 
> PHP or maintain consistency with the conventions of PHP core, only with the 
> conventions it set for itself. However, I can provide a link to the library 
> for reference on the issue if that would be helpful for people that are 
> contributing to the design aspects of this RFC.
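
The scale rules in the numbered list above could be sketched roughly as follows. This is only an illustration of the quoted description, not code from the library, and the function names are made up for the example.

```php
<?php
// Step 2: the implicit scale is the total number of digits in the input.
function sketch_implicit_scale(string $num): int
{
    return strlen((string) preg_replace('/\D/', '', $num));
}

// Steps 3 and 4: an explicit input scale wins, otherwise use
// 10 or the implicit scale, whichever is larger.
function sketch_determined_scale(string $num, ?int $inputScale = null): int
{
    return $inputScale ?? max(10, sketch_implicit_scale($num));
}

// Step 7: an explicit scale wins, otherwise take the higher of the
// two operands' determined scales and add 2 for the internal calculation.
function sketch_calculation_scale(int $leftScale, int $rightScale, ?int $inputScale = null): int
{
    return $inputScale ?? max($leftScale, $rightScale) + 2;
}

$left  = sketch_determined_scale('123.45');          // 10 (only 5 digits)
$right = sketch_determined_scale('1234567.123456');  // 13 digits
$calc  = sketch_calculation_scale($left, $right);    // 15
```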

The two use cases at issue here are when the div and pow's exponent are 
negative values. 

Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Ilija Tovilo
Hi Rowan

On Tue, Apr 2, 2024 at 10:10 PM Rowan Tommins [IMSoP]
 wrote:
>
> On 02/04/2024 01:17, Ilija Tovilo wrote:
>
> I'd like to introduce an idea I've played around with for a couple of
> weeks: Data classes, sometimes called structs in other languages (e.g.
> Swift and C#).
>
> I'm not sure if you've considered it already, but mutating methods should 
> probably be constrained to be void (or maybe "mutating" could occupy the 
> return type slot). Otherwise, someone is bound to write this:
>
> $start = new Location('Here');
> $end = $start->move!('There');
>
> Expecting it to mean this:
>
> $start = new Location('Here');
> $end = $start;
> $end->move!('There');
>
> When it would actually mean this:
>
> $start = new Location('Here');
> $start->move!('There');
> $end = $start;

I think there are some valid patterns for mutating methods with a
return value. For example, Set::add() might return a bool to indicate
whether the value was already present in the set.
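
As a rough illustration in today's PHP (an ordinary class, not the proposed data class syntax, with a hypothetical `SketchStringSet` name), such a method both mutates and returns something useful:

```php
<?php
// Hypothetical set of strings; array keys give uniqueness.
final class SketchStringSet
{
    /** @var array<string, true> */
    private array $items = [];

    // Mutates the set and reports whether the value was already present.
    public function add(string $value): bool
    {
        $wasPresent = isset($this->items[$value]);
        $this->items[$value] = true;
        return $wasPresent;
    }

    public function count(): int
    {
        return count($this->items);
    }
}

$set = new SketchStringSet();
$first  = $set->add('a'); // false: newly inserted
$second = $set->add('a'); // true: already present
```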

> I seem to remember when this was discussed before, the argument being made 
> that separating value objects completely means you have to spend time 
> deciding how they interact with every feature of the language.

Data classes are classes with a single additional
zend_class_entry.ce_flags flag. So unless customized, they behave as
classes. This way, we have the option to tweak any behavior we would
like, but we don't need to.

Of course, this will still require an analysis of what behavior we
might want to tweak.

> Does the copy-on-write optimisation actually require the entire class to be 
> special, or could it be triggered by a mutating method on any object? To 
> allow direct modification of properties as well, we could move the call-site 
> marker slightly to a ->! operator:
>
> $foo->!mutate();
> $foo->!bar = 42;

I suppose this is possible, but it puts the burden for figuring out
what to separate onto the user. Consider this example, which would
work with the current approach:

$shapes[0]->position->zero!();

The left-hand-side of the mutating method call is fetched by
"read+write". Essentially, this ensures that any array or data class
is separated (copied if RC >1).

Without such a class-wide marker, you'll need to remember to add the
special syntax exactly where applicable.

$shapes![0]!->position!->zero();

In this case, $shapes, $shapes[0], and $shapes[0]->position must all
be separated. This seems very easy to mess up, especially since only
zero() is actually known to be separating and can thus be verified at
runtime.
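
For comparison, nested arrays already behave this way in current PHP: a write through a nested path separates every level on the way down, without any call-site marker. The variable names below are only for illustration.

```php
<?php
$shapes = [['position' => ['x' => 3, 'y' => 4]]];

$copy = $shapes;                // no copy yet, storage is shared
$copy[0]['position']['x'] = 0;  // separates $copy, $copy[0] and ['position']

// The original is untouched because arrays have value semantics.
$originalX = $shapes[0]['position']['x']; // 3
$copiedX   = $copy[0]['position']['x'];   // 0

// Ordinary objects, by contrast, are shared through their handle.
$a = new stdClass();
$a->x = 3;
$b = $a;
$b->x = 0;
$sharedX = $a->x; // 0
```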

> The main drawback I can see (outside of the implementation, which I can't 
> comment on) is that we couldn't overload the === operator to use value 
> semantics. In exchange, a lot of decisions would simply be made for us: they 
> would just be objects, with all the same behaviour around inheritance, 
> serialization, and so on.

Right, this would either require some other marker that switches to
this mode of comparison, or operator overloading.

Ilija


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Ilija Tovilo
Hi Niels

On Tue, Apr 2, 2024 at 8:16 PM Niels Dossche  wrote:
>
> On 02/04/2024 02:17, Ilija Tovilo wrote:
> > Hi everyone!
> >
> > I'd like to introduce an idea I've played around with for a couple of
> > weeks: Data classes, sometimes called structs in other languages (e.g.
> > Swift and C#).
>
> As already hinted in the thread, I also think inheritance may be dangerous in 
> a first version.
> I want to add to that: if you extend a data-class with a non-data-class, the 
> data-class behaviour gets lost, which is logical in a sense but also 
> surprised me in a way.

Yes, that's definitely not intended. I haven't implemented any
inheritance checks yet. But if inheritance is allowed, then it should
be restricted to classes of the same kind (by-ref or by-val).

> Also, FWIW, I'm not sure about the name "data" class, perhaps "value" class 
> or something alike is what people may be more familiar with wrt semantics, 
> although dataclass is also a known term.

I'm happy with value class, struct, record, data class, what have you.
I'll accept whatever the majority prefers.

> I do have a question about iterator behaviour. Consider this code:
> ```
> data class Test {
> public $a = 1;
> public $b = 2;
> }
>
> $test = new Test;
> foreach ($test as $k => &$v) {
> if ($k === "b")
> $test->a = $test;
> var_dump($k);
> }
> ```
>
> This will reset the iterator of the object on separation, so we will get an 
> infinite loop.
> Is this intended?
> If so, is it because the right hand side is the original object while the 
> left hand side gets the clone?
> Is this consistent with how arrays separate?

That's a good question. I have not really thought about iterators yet.
Modification of an array iterated by-reference does not restart the
iterator. Actually, by-reference capturing of the value also captures
the array by-reference, which is not completely intuitive.

My initial gut feeling is to handle data classes the same, i.e.
capture them by-reference when iterating the value by reference, so
that iteration is not restarted.
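
The array behaviour referred to here can be seen in current PHP (7.0 and later): when iterating by reference, the loop works on the array itself, so an element appended during iteration is visited and iteration does not restart. By-value iteration works on a snapshot instead.

```php
<?php
// By reference: the appended element is picked up.
$byRef = [1, 2];
$seenByRef = [];
foreach ($byRef as $k => &$v) {
    if ($k === 0) {
        $byRef[] = 3;
    }
    $seenByRef[] = $v;
}
unset($v); // break the reference left behind by the loop

// By value: the loop iterates over the original contents only.
$byVal = [1, 2];
$seenByVal = [];
foreach ($byVal as $k => $v) {
    if ($k === 0) {
        $byVal[] = 3;
    }
    $seenByVal[] = $v;
}

// $seenByRef is [1, 2, 3]; $seenByVal is [1, 2]
```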

Ilija


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Ilija Tovilo
On Tue, Apr 2, 2024 at 9:43 PM Rowan Tommins [IMSoP]
 wrote:
>
> Similarly, if you discover a compromised key or signing account, you can look 
> for uses of that key or account, which might be a tiny number from a non-core 
> contributor; if you discover a compromised account pushing unsigned commits, 
> you have to audit every commit in the repository.

Right, that and what Jakub mentioned are fair arguments.

> I agree it's not a complete solution, but no security measure is; it's always 
> about reducing the attack surface or limiting the damage.

Right. That was the original intention of my e-mail: To point out that
we might also want to consider other mitigations. Not that we
shouldn't do commit signing.

Ilija


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Larry Garfield
On Tue, Apr 2, 2024, at 6:04 PM, Ilija Tovilo wrote:

>> What would be the reason not to?  As you indicated in another reply, the 
>> main reason some languages don't is to avoid large stack copies, but PHP 
>> doesn't have large stack copies for objects anyway so that's a non-issue.
>>
>> I've long argued that the fewer differences there are between service 
>> classes and data classes, the better, so I'm not sure what advantage this 
>> would have other than "ugh, inheritance is such a mess" (which is true, but 
>> that ship sailed long ago).
>
> One issue that just came to mind is object identity. For example:
>
> class Person {
> public function __construct(
> public string $firstname,
> public string $lastname,
> ) {}
> }
>
> class Manager extends Person {
> public function bossAround() {}
> }
>
> $person = new Person('Boss', 'Man');
> $manager = new Manager('Boss', 'Man');
> var_dump($person === $manager); // ???
>
> Equality for data objects is based on data, rather than the object
> handle. How does this interact with inheritance? Technically, Person
> and Manager represent the same data. Manager contains additional
> behavior, but does that change identity?
>
> I'm not sure what the answer is. That's just the first thing that came
> to mind. I'm confident we'll discover more such edge cases. Of course,
> I can invest the time to find the questions before deciding to
> disallow inheritance.

As Bruce already demonstrated, equality should include type, not just 
properties.  Even without inheritance that is necessary.
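
For reference, this is how the existing loose comparison on ordinary objects already works: `==` is true only when both objects are instances of the same class and have equal properties, while `===` compares identity. A small sketch using the classes from the example above:

```php
<?php
class Person {
    public function __construct(
        public string $firstname,
        public string $lastname,
    ) {}
}

class Manager extends Person {
    public function bossAround() {}
}

$person  = new Person('Boss', 'Man');
$twin    = new Person('Boss', 'Man');
$manager = new Manager('Boss', 'Man');

$sameClassEqual = ($person == $twin);     // true: same class, same data
$subclassEqual  = ($person == $manager);  // false: same data, different class
$identical      = ($person === $twin);    // false: different instances
```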

There may be good reason to omit inheritance, as we did on enums, but that 
shouldn't be the starting point.  (I'd have to research and see what other 
languages do. I think it's a mixed bag.)  We should try to ferret out those 
edge cases and see if there are reasonable solutions to them.

>> > * Mutating method calls on data classes use a slightly different
>> > syntax: `$vector->append!(42)`. All methods mutating `$this` must be
>> > marked as `mutating`. The reason for this is twofold: 1. It signals to
>> > the caller that the value is modified. 2. It allows `$vector` to be
>> > cloned before knowing whether the method `append` is modifying, which
>> > hugely reduces implementation complexity in the engine.
>>
>> As discussed in R11, it would be very beneficial if this marker could be on 
>> the method definition, not the method invocation.  You indicated that would 
>> be Hard(tm), but I think it's worth some effort to see if it's surmountably 
>> hard.  (Or at least less hard than just auto-detecting it, which you 
>> indicated is Extremely Hard(tm).)
>
> I think you misunderstood. The intention is to mark both call-site and
> declaration. Call-site is marked with ->method!(), while declaration
> is marked with "public mutating function". Call-site is required to
> avoid the engine complexity, as previously mentioned. But
> declaration-site is required so that the user (and IDEs) even know
> that you need to use the special syntax at the call-site.

Ah, OK.  That's... unfortunate, but I defer to you on the implementation 
complexity.

>> So to the extent there is a consensus, equality, stringifying, and a 
>> hashcode (which we don't have yet, but will need in the future for some 
>> things I suspect) seem to be the rough expected defaults.
>
> I'm just skeptical whether the default __toString() is ever useful. I
> can see an argument for it for quick debugging in languages that don't
> provide something like var_dump(). In PHP this seems much less useful.
> It's impossible to provide a default implementation that works
> everywhere (or pretty much anywhere, even).
>
> Equality is already included. Hashing should be added separately, and
> probably not just to data classes.

The equivalent of Python's __repr__ (which it auto-generates) would be 
__debugInfo().  Arguably its current output is what the default would likely be 
anyway, though.  I believe the typical auto-toString output is the same data, 
but presented in a more human-friendly way.  (So yes, mainly useful for 
debugging.)

Equality, well, we've already debated whether or not we should make that a 
general feature. :-)  Of note, though, in languages with equals(), it's also 
user-overridable.

>> > * In the future, it should be possible to allow using data classes in
>> > `SplObjectStorage`. However, because hashing is complex, this will be
>> > postponed to a separate RFC.

I believe this is where we would want/need a __hash() method or similar; Derick 
and I encountered that while researching collections in other languages.  
Leaving it out for now is fine, but it would be important for any future 
list-of functionality.

>> Would data class properties only be allowed to be other data classes, or 
>> could they hold a non-data class?  My knee jerk response is they should be 
>> data classes all the way down; the only counter-argument I can think of it 
>> would be how much existing code 

Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Deleu
On Tue, Apr 2, 2024 at 1:47 PM Larry Garfield 
wrote:

> > * Data classes protect from interior mutability. More concretely,
> > mutating nested data objects stored in a `readonly` property is not
> > legal, whereas it would be if they were ordinary objects.
> > * In the future, it should be possible to allow using data classes in
> > `SplObjectStorage`. However, because hashing is complex, this will be
> > postponed to a separate RFC.
>
> Would data class properties only be allowed to be other data classes, or
> could they hold a non-data class?  My knee jerk response is they should be
> data classes all the way down; the only counter-argument I can think of it
> would be how much existing code is out there that is a "data class" in all
> but name.  I still fear someone adding a DB connection object to a data
> class and everything going to hell, though. :-)
>

If there is a class made up of 90% data struct and 10% non-data struct, the
90% could be extracted into a true data struct and be referenced in the
existing regular class, making it even more organized in terms of
establishing what's "data" and what's "service". I would really favor
making it "data class" all the way down.

I understand you disagree with the argument against inheritance, but to me
the same logic applies here. Making it data class only allows for lifting
the restriction in the future, if necessary (requiring another RFC vote).
Making it mixed on version 1 means that support for the mixture of them can
never be undone.


-- 
Marco Deleu


Re: [PHP-DEV] First time contributor (DateTime::setDate PR)

2024-04-02 Thread Bilge

On 02/04/2024 14:14, Derick Rethans wrote:

Hi,

On Sun, 31 Mar 2024, Bilge wrote:


About the PR: I sometimes find it would be useful to only update part of the
date. The PR makes all parameters to DateTime(Immutable)::setDate optional in
a backwards-compatible manner such that we can elect to update only the day,
month, year or any combination of the three (thanks, in part, to named
parameters). Without this modification, we must always specify all of the day,
month and year parameters to change the date.

As I mentioned to you in Room 11, I am not in favour of ad hoc API
changes to Date/Time classes. It has now been nearly 18 years since they
were originally introduced, and they indeed could do with an overhaul.

I have been collecting ideas in
https://docs.google.com/document/d/1pxPSRbfATKE4TFWw72K3p7ir-02YQbTf3S3SIxOKWsk/edit#heading=h.2jol7kfhmijb

Having different/better modifiers would also be a good thing to talk
about, albeit perhaps on the four mentioned new classes, instead of
adding them to the already existing DateTime and DateTimeImmutable
classes.

In any case, just allowing setDate to be able to just modify the month
is going to introduce confusion, as this will be counter intuitive:

$dt = new DateTimeImmutable("2024-03-31");
$newDt = $dt->setDate( month: 2 );

It is now representing 2024-03-02.

This might be the right answer, but it might also be that the developer
just cared about the month part (and not the day-of-month), in which
case this is a WTF moment.
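
The same overflow can be reproduced with the existing three-argument API, without the patch. Keeping day 31 while moving to February yields an out-of-range date that rolls over into March:

```php
<?php
$dt = new DateTimeImmutable('2024-03-31');

// 31 February 2024 does not exist; February 2024 has 29 days,
// so the date overflows by two days into March.
$newDt = $dt->setDate(2024, 2, 31);

$result = $newDt->format('Y-m-d'); // 2024-03-02
```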

Picking modification APIs is not as trivial as it seems, and I would like
to do it *right*.

Feel free to add comments and wishes to the Google Doc. In the
near future, I will be writing up an RFC from this.

cheers,
Derick

Hi Derick,

Thanks for your reply!

Indeed, as per your code snippet, this is a WTF moment I had not 
accounted for, and I can confirm the same result with my patch applied. 
Generally, my expectation here would be the month *must* be set to 2, so 
if the day portion will be invalidated by that change, we should either 
throw an exception or implicitly coerce it into range, i.e. (new 
DateTime("2024-03-31"))->setDate(month: 2); == 2024-02-29. However, I 
suppose this is not the conversation we're having as you do not wish to 
change this API at all, which I respect.
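
The "coerce into range" behaviour can be approximated in userland today. The following is only a sketch with a hypothetical helper name (`sketch_with_month_clamped`); it is not part of the PR or of any existing API.

```php
<?php
// Changes only the month, clamping the day to the last day of the
// target month instead of letting it overflow into the next month.
function sketch_with_month_clamped(DateTimeImmutable $date, int $month): DateTimeImmutable
{
    $year = (int) $date->format('Y');
    $day  = (int) $date->format('j');

    // 't' is the number of days in the month of the given date.
    $daysInMonth = (int) $date->setDate($year, $month, 1)->format('t');

    return $date->setDate($year, $month, min($day, $daysInMonth));
}

$clamped = sketch_with_month_clamped(new DateTimeImmutable('2024-03-31'), 2);
$clampedDate = $clamped->format('Y-m-d'); // 2024-02-29
```

Throwing an exception instead of clamping would be the stricter alternative; which of the two is less surprising is exactly the design question being discussed.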


Regarding your brainstorm document, I can't understand much of it in its 
current state, and as I am not a subject matter expert, I think you will 
receive much better feedback from others. In particular, I cannot glean 
which four classes you are referring to in that document. Yet what I do 
find interesting is the notion of adding setters to DateTimeImmutable. 
For my particular use-case—producing a collection of dates incrementing 
by year in a Twig template—a trivial year setter would do just fine, 
with the significant caveat that it must implement fluent interface, 
because I need to call it in an expression context (returning a value), 
not a statement context (executing a void function separately). Not that 
Twig cannot execute statements, but it just becomes more verbose, 
cumbersome and less template-like.


If you were happy for me to add getters and fluent setters for year, I'd 
be happy to work on that PR, but for month we're back to the same 
problem outlined in the opening paragraph (and I suppose the same 
problem occasionally applies to year, if the day happens to be set to 
the leap day). Otherwise, I'll be happy to read over your RFC when it's 
ready.


Kind regards, Bilge


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Rowan Tommins [IMSoP]

On 02/04/2024 01:17, Ilija Tovilo wrote:

I'd like to introduce an idea I've played around with for a couple of
weeks: Data classes, sometimes called structs in other languages (e.g.
Swift and C#).



Hi Ilija,

I'm really interested to see how this develops. A couple of thoughts 
that immediately occurred to me...



I'm not sure if you've considered it already, but mutating methods 
should probably be constrained to be void (or maybe "mutating" could 
occupy the return type slot). Otherwise, someone is bound to write this:


$start = new Location('Here');
$end = $start->move!('There');

Expecting it to mean this:

$start = new Location('Here');
$end = $start;
$end->move!('There');

When it would actually mean this:

$start = new Location('Here');
$start->move!('There');
$end = $start;
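
For comparison, ordinary objects already behave this way today when a mutating method returns $this. A minimal sketch in current syntax (no `!` marker), with a hypothetical Location class:

```php
<?php
declare(strict_types=1);

// Hypothetical class with today's by-handle object semantics.
final class Location
{
    public function __construct(public string $name) {}

    // Mutates in place and returns $this, fluent style.
    public function move(string $name): static
    {
        $this->name = $name;
        return $this;
    }
}

$start = new Location('Here');
$end = $start->move('There');

var_dump($start === $end); // bool(true): both variables hold the same object
var_dump($start->name);    // string(5) "There"
```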


I seem to remember when this was discussed before, the argument being 
made that separating value objects completely means you have to spend 
time deciding how they interact with every feature of the language.


Does the copy-on-write optimisation actually require the entire class to 
be special, or could it be triggered by a mutating method on any object? 
To allow direct modification of properties as well, we could move the 
call-site marker slightly to a ->! operator:


$foo->!mutate();
$foo->!bar = 42;

The first would be the same as your current version: it would perform a 
CoW reference separation / clone, then call the method, which would 
require a "mutating" marker. The second would essentially be an 
optimised version of $foo = clone $foo with [ 'bar' => 42 ]
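
In today's PHP the unoptimised form of that second case can be sketched roughly as follows. The `with` method and the Point class are hypothetical, and the properties are deliberately not readonly, since as far as I know current releases only allow readonly properties to be rewritten inside __clone.

```php
<?php
declare(strict_types=1);

// Hypothetical value-like class; "with" always clones, there is no CoW here.
final class Point
{
    public function __construct(public int $x = 0, public int $y = 0) {}

    // Clone first, then apply the changes to the copy only.
    public function with(array $changes): static
    {
        $copy = clone $this;
        foreach ($changes as $name => $value) {
            $copy->$name = $value;
        }
        return $copy;
    }
}

$a = new Point(1, 2);
$b = $a->with(['x' => 42]);

var_dump($a->x); // int(1): the original is untouched
var_dump($b->x); // int(42)
```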


During the method call or write operation, readonly properties would 
allow an additional write, as is the case in __clone and the "clone 
with" proposal. So a "pure" data object would simply be declared with 
the existing "readonly class" syntax.


The main drawback I can see (outside of the implementation, which I 
can't comment on) is that we couldn't overload the === operator to use 
value semantics. In exchange, a lot of decisions would simply be made 
for us: they would just be objects, with all the same behaviour around 
inheritance, serialization, and so on.



Regards,

--
Rowan Tommins
[IMSoP]


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Jakub Zelenka
On Tue, Apr 2, 2024 at 8:45 PM Rowan Tommins [IMSoP] 
wrote:

> On 02/04/2024 20:02, Ilija Tovilo wrote:
>
> But, does it matter? I'm not sure we look at some commits closer than
> others, based on its author. It's true that it might be easier to
> identify malicious commits if they all come from the same user, but it
> wouldn't prevent them.
>
>
> It's like the difference between stealing someone's credit card, and
> cloning the card of everyone who comes into the shop: in the first case,
> someone needs to check their credit card statements carefully; in the
> second, you'll have a hard job even working out who to contact.
>
> Similarly, if you discover a compromised key or signing account, you can
> look for uses of that key or account, which might be a tiny number from a
> non-core contributor; if you discover a compromised account pushing
> unsigned commits, you have to audit every commit in the repository.
>
> I agree it's not a complete solution, but no security measure is; it's
> always about reducing the attack surface or limiting the damage.
>

Nice comparison. Fully agree with that. I would add that a potentially even
more important point than auditability is the possibility of revoking the
access of the compromised account, as otherwise you can't easily identify
such an account and prevent further issues.

Regards

Jakub


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Rowan Tommins [IMSoP]

On 02/04/2024 20:02, Ilija Tovilo wrote:

But, does it matter? I'm not sure we look at some commits closer than
others, based on its author. It's true that it might be easier to
identify malicious commits if they all come from the same user, but it
wouldn't prevent them.



It's like the difference between stealing someone's credit card, and 
cloning the card of everyone who comes into the shop: in the first case, 
someone needs to check their credit card statements carefully; in the 
second, you'll have a hard job even working out who to contact.


Similarly, if you discover a compromised key or signing account, you can 
look for uses of that key or account, which might be a tiny number from 
a non-core contributor; if you discover a compromised account pushing 
unsigned commits, you have to audit every commit in the repository.


I agree it's not a complete solution, but no security measure is; it's 
always about reducing the attack surface or limiting the damage.


Regards,

--
Rowan Tommins
[IMSoP]


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Rob Landers
On Tue, Apr 2, 2024, at 20:51, Bruce Weirdan wrote:
> On Tue, Apr 2, 2024 at 8:05 PM Ilija Tovilo  wrote:
> 
> > Equality for data objects is based on data, rather than the object
> > handle.
> 
> I believe equality should always consider the type of the object.
> 
> ```php
> new Problem(size:'big') === new Universe(size:'big')
> && new Problem(size:'big') === new Shoe(size:'big');
> ```
> 
> If the above can ever be true then I'm not sure how big is the problem
> (but probably very big).
> Also see the examples of non-comparable ids - `new CompanyId(1)`
> should not be equal to `new PersonId(1)`
> 
> And I'd find it very confusing if the following crashed
> 
> ```php
> function f(Universe $_u): void {}
> $universe = new Universe(size:'big');
> $shoe = new Shoe(size:'big');
> 
> if ($shoe === $universe) {
>     // shoe is *identical* to the universe, so it should be
>     // accepted wherever the universe is
>     f($shoe);
> }
> ```
> 
> -- 
>   Best regards,
>   Bruce Weirdan 
> mailto:weir...@gmail.com
> 

I'd love to see it so that equality was more like == for regular objects. If 
the type matches and the data matches, it's true. It'd be really helpful to be 
able to downcast types though. Such as in my user id example I gave earlier. 
Once it reaches a certain point in the code, it doesn't matter that it was once 
a UserId, it just matters that it is currently an Id.
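
For reference, this is how == and === already behave for regular objects. The two ID classes are hypothetical, borrowed from Bruce's example:

```php
<?php
declare(strict_types=1);

final class CompanyId
{
    public function __construct(public int $id) {}
}

final class PersonId
{
    public function __construct(public int $id) {}
}

var_dump(new CompanyId(1) == new CompanyId(1));  // bool(true): same class, same data
var_dump(new CompanyId(1) == new PersonId(1));   // bool(false): different classes
var_dump(new CompanyId(1) === new CompanyId(1)); // bool(false): different instances
```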

Now that I think about it, decoration might be better than inheritance here and 
inheritance might make more sense to be banned. In other words, this might be 
just as simple and easy to use:

data class Id {
  public function __construct(public string $id) {}
}

data class UserId {
  public function __construct(public Id $id) {}
}

Though it would be really interesting to use them as "traits" for each other to 
say "this data class can be converted to another type, but information will be 
lost" where they are 100% separate types but can be "cast" to specified types.

// "use" has all the same rules as extends, but,
// UserId is not an Id; it can be converted to an Id
data class UserId use Id {
  public function __construct(public string $id, public string $name) {}
}

$user = new UserId('123', 'rob');

$id = (Id) $user;

$user !== $id === true;

$id is 100% Id and lost all its "userness." Hmm. Interesting indeed. Probably 
not practical, but interesting.
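
Without any new syntax, the closest thing today would probably be an explicit conversion method. A rough sketch, where `toId()` is a hypothetical stand-in for the `(Id)` cast and both classes are ordinary classes rather than data classes:

```php
<?php
declare(strict_types=1);

final class Id
{
    public function __construct(public string $id) {}
}

final class UserId
{
    public function __construct(public string $id, public string $name) {}

    // Lossy conversion: the name is dropped, only the id survives.
    public function toId(): Id
    {
        return new Id($this->id);
    }
}

$user = new UserId('123', 'rob');
$id = $user->toId();

var_dump($id instanceof Id);     // bool(true)
var_dump($id instanceof UserId); // bool(false)
var_dump($id == new Id('123'));  // bool(true)
```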

— Rob

Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Ilija Tovilo
Hi Rowan

On Tue, Apr 2, 2024 at 8:48 PM Rowan Tommins [IMSoP]
 wrote:
>
> In fact, you don't need to compromise anybody's key: you could socially 
> engineer a situation where you have push access to the repository, or break 
> the security in some other way. As I understand it, this is exactly what 
> happened 3 years ago: someone gained direct write access to the git.php.net 
> server, and added commits "authored by" Nikita and others to the history in 
> the repository.

Right, but I would like to believe that attaining push access _without
gaining access to a maintainer's account_ should be substantially
harder on GitHub than on our self-hosted git server. :)

> If all commits are signed, a compromised key or account can only be used to 
> sign commits with that specific identity: your GitHub account can't be used 
> to sign commits as Derick or Nikita, only as you. The impact is limited to 
> one identity, not the integrity of the entire repository.

But, does it matter? I'm not sure we look at some commits closer than
others, based on its author. It's true that it might be easier to
identify malicious commits if they all come from the same user, but it
wouldn't prevent them.

To be clear: I'm not against commit signing, I've been doing it for
years. I'm just unsure if it's a sufficient solution (apart from
releases, which are a whole different can of worms).

Ilija


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Bruce Weirdan
On Tue, Apr 2, 2024 at 8:05 PM Ilija Tovilo  wrote:

> Equality for data objects is based on data, rather than the object
> handle.

I believe equality should always consider the type of the object.

```php
new Problem(size:'big') === new Universe(size:'big')
&& new Problem(size:'big') === new Shoe(size:'big');
```

If the above can ever be true then I'm not sure how big is the problem
(but probably very big).
Also see the examples of non-comparable ids - `new CompanyId(1)`
should not be equal to `new PersonId(1)`

And I'd find it very confusing if the following crashed

```php
function f(Universe $_u): void {}
$universe = new Universe(size:'big');
$shoe = new Shoe(size:'big');

if ($shoe === $universe) {
   // shoe is *identical* to the universe, so it should be
   // accepted wherever the universe is
   f($shoe);
}
```

-- 
  Best regards,
  Bruce Weirdan mailto:weir...@gmail.com


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Rowan Tommins [IMSoP]

On 02/04/2024 18:27, Ilija Tovilo wrote:

If your GitHub account is compromised,
[...] the attacker may simply register their
own gpg key in your account, with the commits appearing as verified.

If your ssh key is compromised instead, and you use ssh to sign your
commits, the attacker may sign their malicious commits with that same
key they may use to push.



The key point (pun not intended) is that git doesn't record who pushed a 
commit - pushing is just data synchronization, not part of the history. 
What it records is who "authored" the commit, and by default that's just 
plain text; so if somebody compromises an SSH key or access token 
authorised to your GitHub account, they can push commits "authored by" 
Derick, or Nikita, or Bill Gates, and there is no way to tell them apart 
from the real thing.


In fact, you don't need to compromise anybody's key: you could socially 
engineer a situation where you have push access to the repository, or 
break the security in some other way. As I understand it, this is 
exactly what happened 3 years ago: someone gained direct write access to 
the git.php.net server, and added commits "authored by" Nikita and 
others to the history in the repository.


If all commits are signed, a compromised key or account can only be used 
to sign commits with that specific identity: your GitHub account can't 
be used to sign commits as Derick or Nikita, only as you. The impact is 
limited to one identity, not the integrity of the entire repository.


Regards,

--
Rowan Tommins
[IMSoP]


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-04-02 Thread Jakub Zelenka
Hi,

On Tue, Apr 2, 2024 at 7:14 PM Stanislav Malyshev 
wrote:

> Hi!
>
> That is something PHP is missing atm, no one can verify the build process
>> for releases.
>>
>
> Yes that's what I was suggesting. This should be done by RM. In that way,
> the RM becomes more someone that verifies the build and not the actual
> person that provides the build.
>
> I'm not sure though how the RM can really verify it. I mean, we have the
> tar blob that comes from the git repo - which we assume is legit. We also
> have some files that aren't in the repo. If the RM builds them themselves,
> then the question comes up: what if the RM's environment is compromised and
> something bad is injected? If the RM receives the files from an outside
> source, how does the RM verify they are genuine?  I don't think reading
> through the whole "configure" file and verifying it's not bad is realistic
> for any person. And from what I understand, "configure" and such are quite
> environment-dependent, so you can't just have a standard hash to compare
> to. You can't have the RM just run "buildconf" again and do a hash check,
> because they may get different bits than the ones coming from the outside,
> like CI. I dunno, maybe if we had some kind of Docker image for generating
> it that would produce a reproducible result, that'd be possible? Otherwise
> I am still not sure what the verification procedure would look like.
>

Yeah, as I already noted, it needs to be reproducible, so the RM would
need to have exactly the same version of all build tools as used in CI. I
think the only option would be to use a Docker image for that. We could then
use the same image in CI (job container). In that way we should be able to
implement the same process (there might be some extra bits to do but I think
it should be doable in general). We could potentially store the produced
hashes in some CI artifact and possibly also make them available from the
downloads server (once downloaded from CI) so the RM could have a script
that just automatically compares all hashes. So the ideal scenario would be
that the RM just runs a command that will do it all for them.
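
A very rough sketch of what such a comparison script could look like, just to show the shape of it. The function name, the "relative path => SHA-256" manifest format and the file names are all assumptions on my side; nothing like this exists yet.

```php
<?php
declare(strict_types=1);

/**
 * Compare files under $dir against expected SHA-256 hashes.
 *
 * @param array<string, string> $expected relative path => hex digest
 * @return list<string> paths that are missing or whose hash differs
 */
function findMismatches(string $dir, array $expected): array
{
    $bad = [];
    foreach ($expected as $path => $digest) {
        $file = $dir . DIRECTORY_SEPARATOR . $path;
        $actual = is_file($file) ? hash_file('sha256', $file) : false;
        if ($actual === false || !hash_equals($digest, $actual)) {
            $bad[] = $path;
        }
    }
    return $bad;
}

// Demo with a throwaway directory; real input would be the hashes from CI.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'rm-verify-' . bin2hex(random_bytes(4));
mkdir($dir);
file_put_contents($dir . DIRECTORY_SEPARATOR . 'configure', "#!/bin/sh\n");

$expected = [
    'configure' => hash('sha256', "#!/bin/sh\n"),
    'missing-file' => str_repeat('0', 64),
];

print_r(findMismatches($dir, $expected)); // only "missing-file" is reported
```

This only helps if the files are byte-for-byte reproducible, which is why the shared Docker image matters.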


> Right now, as I understand it, we're simply trusting the RM that they have
> an uncompromised environment and third parties have no way to verify it's
> the case. But I guess it's time we do better?
>

Yes exactly that. Currently the RM can change the build as they want so if
they are compromised, then we might have the same issue that happened to XZ.

Regards

Jakub


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Niels Dossche
On 02/04/2024 02:17, Ilija Tovilo wrote:
> Hi everyone!
> 
> I'd like to introduce an idea I've played around with for a couple of
> weeks: Data classes, sometimes called structs in other languages (e.g.
> Swift and C#).
> 
> In a nutshell, data classes are classes with value semantics.
> Instances of data classes are implicitly copied when assigned to a
> variable, or when passed to a function. When the new instance is
> modified, the original instance remains untouched. This might sound
> familiar: It's exactly how arrays work in PHP.
> 
> ```php
> $a = [1, 2, 3];
> $b = $a;
> $b[] = 4;
> var_dump($a); // [1, 2, 3]
> var_dump($b); // [1, 2, 3, 4]
> ```
> 
> You may think that copying the array on each assignment is expensive,
> and you would be right. PHP uses a trick called copy-on-write, or CoW
> for short. `$a` and `$b` actually share the same array until `$b[] =
> 4;` modifies it. It's only at this point that the array is copied and
> replaced in `$b`, so that the modification doesn't affect `$a`. As
> long as a variable is the sole owner of a value, or none of the
> variables modify the value, no copy is needed. Data classes use the
> same mechanism.
> 
> But why value semantics in the first place? There are two major flaws
> with by-reference semantics for data structures:
> 
> 1. It's very easy to forget cloning data that is referenced somewhere
> else before modifying it. This will lead to "spooky actions at a
> distance". Having recently used JavaScript (where all data structures
> have by-reference semantics) for an educational IR optimizer,
> accidental mutations of shared arrays/maps/sets were my primary source
> of bugs.
> 2. Defensive cloning (to avoid issue 1) will lead to useless work when
> the value is not referenced anywhere else.
> 
> PHP offers readonly properties and classes to address issue 1.
> However, they further promote issue 2 by making it impossible to
> modify values without cloning them first, even if we know they are not
> referenced anywhere else. Some APIs further exacerbate the issue by
> requiring multiple copies for multiple modifications (e.g.
> `$response->withStatus(200)->withHeader('X-foo', 'foo');`).
> 
> As you may have noticed, arrays already solve both of these issues
> through CoW. Data classes allow implementing arbitrary data structures
> with the same value semantics in core, extensions or userland. For
> example, a `Vector` data class may look something like the following:
> 
> ```php
> data class Vector {
>     private $values;
> 
>     public function __construct(...$values) {
>         $this->values = $values;
>     }
> 
>     public mutating function append($value) {
>         $this->values[] = $value;
>     }
> }
> 
> $a = new Vector(1, 2, 3);
> $b = $a;
> $b->append!(4);
> var_dump($a); // Vector(1, 2, 3)
> var_dump($b); // Vector(1, 2, 3, 4)
> ```
> 
> An internal Vector implementation might offer a faster and stricter
> alternative to arrays (e.g. Vector from php-ds).
> 
> Some other things to note about data classes:
> 
> * Data classes are ordinary classes, and as such may implement
> interfaces, methods and more. I have not decided whether they should
> support inheritance.
> * Mutating method calls on data classes use a slightly different
> syntax: `$vector->append!(42)`. All methods mutating `$this` must be
> marked as `mutating`. The reason for this is twofold: 1. It signals to
> the caller that the value is modified. 2. It allows `$vector` to be
> cloned before knowing whether the method `append` is modifying, which
> hugely reduces implementation complexity in the engine.
> * Data classes customize identity (`===`) comparison, in the same way
> arrays do. Two data objects are identical if all their properties are
> identical (including order for dynamic properties).
> * Sharing data classes by-reference is possible using references, as
> you would for arrays.
> * We may decide to auto-implement `__toString` for data classes,
> amongst other things. I am still undecided whether this is useful for
> PHP.
> * Data classes protect from interior mutability. More concretely,
> mutating nested data objects stored in a `readonly` property is not
> legal, whereas it would be if they were ordinary objects.
> * In the future, it should be possible to allow using data classes in
> `SplObjectStorage`. However, because hashing is complex, this will be
> postponed to a separate RFC.
> 
> One known gotcha is that we cannot trivially enforce placement of
> `mutating` on methods without a performance hit. It is the
> responsibility of the user to correctly mark such methods.
> 
> Here's a fully functional PoC, excluding JIT:
> https://github.com/php/php-src/pull/13800
> 
> Let me know what you think. I will start working on an RFC draft once
> work on property hooks concludes.
> 
> Ilija

Hi Ilija

Thank you for this proposal, I like the idea of having value semantic objects 
available.
I pulled your branch and played with it a bit.

As already hinted in 

Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-04-02 Thread Stanislav Malyshev

Hi!



That is something PHP is missing atm, no one can verify the build
process for releases.


Yes that's what I was suggesting. This should be done by RM. In that 
way, the RM becomes more someone that verifies the build and not the 
actual person that provides the build.


I'm not sure though how the RM can really verify it. I mean, we have the 
tar blob that comes from the git repo - which we assume is legit. We 
also have some files that aren't in the repo. If the RM builds them 
themselves, then the question comes up: what if the RM's environment is 
compromised and something bad is injected? If the RM receives the files 
from an outside source, how does the RM verify they are genuine?  I 
don't think reading through the whole "configure" file and verifying 
it's not bad is realistic for any person. And from what I understand, 
"configure" and such are quite environment-dependent, so you can't just 
have a standard hash to compare to. You can't have the RM just run 
"buildconf" again and do a hash check, because they may get different 
bits than the ones coming from the outside, like CI. I dunno, maybe if 
we had some kind of Docker image for generating it that would produce a 
reproducible result, that'd be possible? Otherwise I am still not sure 
what the verification procedure would look like.


Right now, as I understand it, we're simply trusting the RM that they 
have an uncompromised environment and third parties have no way to 
verify it's the case. But I guess it's time we do better?


Thanks,

Stas


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Ilija Tovilo
Hi Larry

On Tue, Apr 2, 2024 at 5:31 PM Larry Garfield  wrote:
>
> On Tue, Apr 2, 2024, at 12:17 AM, Ilija Tovilo wrote:
> > Hi everyone!
> >
> > I'd like to introduce an idea I've played around with for a couple of
> > weeks: Data classes, sometimes called structs in other languages (e.g.
> > Swift and C#).
> >
> > * Data classes are ordinary classes, and as such may implement
> > interfaces, methods and more. I have not decided whether they should
> > support inheritance.
>
> What would be the reason not to?  As you indicated in another reply, the main 
> reason some languages don't is to avoid large stack copies, but PHP doesn't 
> have large stack copies for objects anyway so that's a non-issue.
>
> I've long argued that the fewer differences there are between service classes 
> and data classes, the better, so I'm not sure what advantage this would have 
> other than "ugh, inheritance is such a mess" (which is true, but that ship 
> sailed long ago).

One issue that just came to mind is object identity. For example:

class Person {
    public function __construct(
        public string $firstname,
        public string $lastname,
    ) {}
}

class Manager extends Person {
    public function bossAround() {}
}

$person = new Person('Boss', 'Man');
$manager = new Manager('Boss', 'Man');
var_dump($person === $manager); // ???

Equality for data objects is based on data, rather than the object
handle. How does this interact with inheritance? Technically, Person
and Manager represent the same data. Manager contains additional
behavior, but does that change identity?

I'm not sure what the answer is. That's just the first thing that came
to mind. I'm confident we'll discover more such edge cases. Of course,
I can invest the time to find the questions before deciding to
disallow inheritance.
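
For reference, here is a sketch of what existing PHP does with ordinary classes in this situation. It is not what the PoC does; the `get_object_vars()` comparison is just one way to express "same data, ignoring the class", and it only sees public properties when called from outside the class.

```php
<?php
declare(strict_types=1);

class Person
{
    public function __construct(
        public string $firstname,
        public string $lastname,
    ) {}
}

class Manager extends Person
{
    public function bossAround(): void {}
}

$person = new Person('Boss', 'Man');
$manager = new Manager('Boss', 'Man');

// Existing loose comparison requires the same class as well as the same data.
var_dump($person == $manager); // bool(false)

// Comparing only the (public) property data ignores the class.
var_dump(get_object_vars($person) === get_object_vars($manager)); // bool(true)
```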

> > * Mutating method calls on data classes use a slightly different
> > syntax: `$vector->append!(42)`. All methods mutating `$this` must be
> > marked as `mutating`. The reason for this is twofold: 1. It signals to
> > the caller that the value is modified. 2. It allows `$vector` to be
> > cloned before knowing whether the method `append` is modifying, which
> > hugely reduces implementation complexity in the engine.
>
> As discussed in R11, it would be very beneficial if this marker could be on 
> the method definition, not the method invocation.  You indicated that would 
> be Hard(tm), but I think it's worth some effort to see if it's surmountably 
> hard.  (Or at least less hard than just auto-detecting it, which you 
> indicated is Extremely Hard(tm).)

I think you misunderstood. The intention is to mark both call-site and
declaration. Call-site is marked with ->method!(), while declaration
is marked with "public mutating function". Call-site is required to
avoid the engine complexity, as previously mentioned. But
declaration-site is required so that the user (and IDEs) even know
that you need to use the special syntax at the call-site.

> So to the extent there is a consensus, equality, stringifying, and a hashcode 
> (which we don't have yet, but will need in the future for some things I 
> suspect) seem to be the rough expected defaults.

I'm just skeptical whether the default __toString() is ever useful. I
can see an argument for it for quick debugging in languages that don't
provide something like var_dump(). In PHP this seems much less useful.
It's impossible to provide a default implementation that works
everywhere (or pretty much anywhere, even).

Equality is already included. Hashing should be added separately, and
probably not just to data classes.

> > * In the future, it should be possible to allow using data classes in
> > `SplObjectStorage`. However, because hashing is complex, this will be
> > postponed to a separate RFC.
>
> Would data class properties only be allowed to be other data classes, or 
> could they hold a non-data class?  My knee jerk response is they should be 
> data classes all the way down; the only counter-argument I can think of it 
> would be how much existing code is out there that is a "data class" in all 
> but name.  I still fear someone adding a DB connection object to a data class 
> and everything going to hell, though. :-)

Disallowing ordinary by-ref objects is not trivial without additional
performance penalties, and I don't see a good reason for it. Can you
provide an example on when that would be problematic?

Ilija


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Larry Garfield
On Tue, Apr 2, 2024, at 5:27 PM, Ilija Tovilo wrote:
> Hi Derick
>
> On Tue, Apr 2, 2024 at 4:15 PM Derick Rethans  wrote:
>>
>> What do y'all think about requiring GPG signed commits for the php-src
>> repository?
>
> Let me repost my internal response for visibility.
>
> I'm currently struggling to understand what kind of attack signing
> commits prevents.
>
> If your GitHub account is compromised, GitHub allows the attacker to
> commit via web interface and will happily sign their commits with a
> gpg key auto-generated for your account.
>
> See: 
> https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification
>
>> GitHub will automatically use GPG to sign commits you make using the web 
>> interface. Commits signed by GitHub will have a verified status. You can 
>> verify the signature locally using the public key available at 
>> https://github.com/web-flow.gpg.
>
> Even if this wasn't the case, the attacker may simply register their
> own gpg key in your account, with the commits appearing as verified.
>
> If your ssh key is compromised instead, and you use ssh to sign your
> commits, the attacker may sign their malicious commits with that same
> key they may use to push.
>
> The only thing this really seems to prevent is pushing commits via a
> compromised ssh key, while commits need to be signed with gpg. If
> that's the intention, we should require using gpg rather than ssh for
> signing (or using a different ssh key, I suppose). Additionally, it
> may help for people who push via HTTP+auth token, but that's probably
> not advisable in the first place.
>
> Something that may also help is restricting pushes to patch branches
> (PHP-x.y.z) to release managers. These branches are not commonly
> looked at by the public, and so it may be easier to sneak malicious
> commits into them.
>
> In addition, we should keep GitHub privileges narrow, especially
> branch protection configuration.
>
> As mentioned by others, this does not prevent the xz issue. But paired
> with an auto-deployment solution, it could definitely help. It would
> be even better if release managers cannot change CI, and CI
> maintainers cannot create releases, as this essentially enforces the
> 4-eyes principle. The former may be hard to enforce, as CI lives in
> the same repository.
>
> Another solution might be to require PRs, and PR verifications. But
> this will inevitably create overhead for maintainers.
>
> Ilija

Coming from corporate projects at the moment, I always hard-block pushing 
straight to the master branch.  Everything goes through a PR, and has to be 
approved by someone other than the author, guaranteeing 4 eyes for every line 
of code.  And that's for internal backend services.

It's always struck me as mind-boggling that a project the size of PHP doesn't 
do that.  Yes, it's a little more overhead, but with the larger team we now 
have (thanks to the Foundation) I believe the human-security checks it gives us 
are well worth it.  (And just from a technical standpoint, even the best 
developer goofs up and needs their code reviewed by someone.)

I have no particular input on the code signing front, other than please have 
clear documentation to follow for someone setting it up for the first time as 
GPG has always been a UX nightmare. :-)

--Larry Garfield


Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Jordan LeDoux
On Tue, Apr 2, 2024 at 10:24 AM Jordan LeDoux 
wrote:

>
>
> On Tue, Apr 2, 2024 at 3:12 AM Lynn  wrote:
>
>>
>> I'm inexperienced when it comes to maths and the precision here, but I do
>> have some experience when it comes to what the business I work for wants.
>> I've implemented BCMath in a couple of places where this kind of precision
>> is necessary, and I found that whenever I do divisions I prefer having at
>> least 2 extra digits. Would it make sense to internally always just store a
>> more accurate number? For things like
>> additions/multiplications/subtractions it could always use the highest
>> precision, and then for divisions add like +3~6 or something. Whenever you
>> have numbers that have a fraction like `10.5001` it makes sense to set it
>> to 4, but when you have `10` it suddenly becomes 0 when implicitly setting
>> it.
>>
>> For the following examples assume each number is a BcNum:
>> When doing something like `10 * 10. * 10.0` I want the end
>> result to have a precision of at least 9 so I don't lose information. When
>> I do `((10 / 3) * 100) * 2` I don't want it to implicitly become 0, because
>> the precision here is important to me. I don't think using infinite
>> precision here is a reasonable approach either. I'm not sure what the
>> correct answer is, perhaps it's just "always manually set the precision"?
>>
>
> In my library, if the scale is unspecified, I actually set the scale to 10
> OR the length of the input string, including integer decimals, whichever is
> larger. Since I was designing my own library I could do things like that as
> convention, and a scale of 10 is extremely fast, even with the horrifically
> slow BCMath library, but covers most use cases (the overwhelmingly most
> common of which is exact calculation of money).
>
> My library handles scale using the following design. It's not necessarily
> correct here, as I was designing a PHP library instead of something for
> core, AND my library does not have to deal with operator overloads so I'm
> always working with method signatures instead, AND it's possible that my
> class/method design is inferior to other alternatives, however it went:
>
> 1. Each number constructor allowed for an optional input scale.
> 2. The input number was converted into the proper formatting from allowed
> input types, and then the implicit scale is set to the total number of
> digits.
> 3. If the input scale was provided, the determined scale is set to that
> value.
> 4. Otherwise, the determined scale at construction is set to 10 or the
> implicit scale of "number of digits", whichever is larger.
> 5. The class contained the `roundToScale` method, which allowed you to
> provide the desired scale and the rounding method, and then would set the
> determined scale to that value after rounding. It contained the `round`
> method with the same parameters to allow rounding to a specific scale
> without also setting the internal determined scale at the same time.
> 6. The class contained the `setScale` method which set the value of the
> internal determined scale value to an int without mutating the value at all.
> 7. All mathematical operation methods which depended on scale, (such as
> div or pow), allowed an optional input scale that would be used for
> calculation if present. If it was not present, the internal calculations
> were done by taking the higher of the determined scale between the two
> operands, and then adding 2, and then the result was done by rounding using
> the default method of ROUND_HALF_EVEN if no rounding method was provided.
>
> Again, though I have spent a lot of design time on this issue for the math
> library I developed, my library did not have to deal with the RFC process
> for PHP or maintain consistency with the conventions of PHP core, only with
> the conventions it set for itself. However, I can provide a link to the
> library for reference on the issue if that would be helpful for people that
> are contributing to the design aspects of this RFC.
>
> > The current assumption is that a Number always holds a single value. How
> if we made it so that it held two values? They are the numerator and the
> denominator.
>
> Again, my experience on the issue is with the development of my own
> library on the issue, however in my case I fully separated that kind of
> object into its own class `Fraction`, and gave the kinds of operations
> we've been discussing to the class `Decimal`. Storing numerators and
> denominators for as long as possible involves a completely different set of
> math. For instance, you need an algorithm to determine the Greatest Common
> Factor and the Least Common Multiple in such a class, because there are a
> lot of places where you would need to find the smallest common denominator
> or simplify the fraction.
>
> Abstracting between the `Fraction` and `Decimal` so that they worked with
> each other honestly introduced the most complex and inscrutable code in my
> entire library, 

Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Ilija Tovilo
Hi Derick

On Tue, Apr 2, 2024 at 4:15 PM Derick Rethans  wrote:
>
> What do y'all think about requiring GPG signed commits for the php-src
> repository?

Let me repost my internal response for visibility.

I'm currently struggling to understand what kind of attack signing
commits prevents.

If your GitHub account is compromised, GitHub allows the attacker to
commit via web interface and will happily sign their commits with a
gpg key auto-generated for your account.

See: 
https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification

> GitHub will automatically use GPG to sign commits you make using the web 
> interface. Commits signed by GitHub will have a verified status. You can 
> verify the signature locally using the public key available at 
> https://github.com/web-flow.gpg.

Even if this wasn't the case, the attacker may simply register their
own gpg key in your account, with the commits appearing as verified.

If your ssh key is compromised instead, and you use ssh to sign your
commits, the attacker may sign their malicious commits with that same
key they may use to push.

The only thing this really seems to prevent is pushing commits via a
compromised ssh key, while commits need to be signed with gpg. If
that's the intention, we should require using gpg rather than ssh for
signing (or using a different ssh key, I suppose). Additionally, it
may help for people who push via HTTP+auth token, but that's probably
not advisable in the first place.

Something that may also help is restricting pushes to patch branches
(PHP-x.y.z) to release managers. These branches are not commonly
looked at by the public, and so it may be easier to sneak malicious
commits into them.

In addition, we should keep GitHub privileges narrow, especially
branch protection configuration.

As mentioned by others, this does not prevent the xz issue. But paired
with an auto-deployment solution, it could definitely help. It would
be even better if release managers cannot change CI, and CI
maintainers cannot create releases, as this essentially enforces the
4-eyes principle. The former may be hard to enforce, as CI lives in
the same repository.

Another solution might be to require PRs, and PR verifications. But
this will inevitably create overhead for maintainers.

Ilija


Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Jordan LeDoux
On Tue, Apr 2, 2024 at 3:12 AM Lynn  wrote:

>
> I'm inexperienced when it comes to maths and the precision here, but I do
> have some experience when it comes to what the business I work for wants.
> I've implemented BCMath in a couple of places where this kind of precision
> is necessary, and I found that whenever I do divisions I prefer having at
> least 2 extra digits. Would it make sense to internally always just store a
> more accurate number? For things like
> additions/multiplications/subtractions it could always use the highest
> precision, and then for divisions add like +3~6 or something. Whenever you
> have numbers that have a fraction like `10.5001` it makes sense to set it
> to 4, but when you have `10` it suddenly becomes 0 when implicitly setting
> it.
>
> For the following examples assume each number is a BcNum:
> When doing something like `10 * 10. * 10.0` I want the end
> result to have a precision of at least 9 so I don't lose information. When
> I do `((10 / 3) * 100) * 2` I don't want it to implicitly become 0, because
> the precision here is important to me. I don't think using infinite
> precision here is a reasonable approach either. I'm not sure what the
> correct answer is, perhaps it's just "always manually set the precision"?
>

In my library, if the scale is unspecified, I actually set the scale to 10
OR the length of the input string, including integer decimals, whichever is
larger. Since I was designing my own library I could do things like that as
convention, and a scale of 10 is extremely fast, even with the horrifically
slow BCMath library, but covers most use cases (by far the most common
of which is exact calculation of money).

My library handles scale using the following design. It's not necessarily
correct here, as I was designing a PHP library instead of something for
core, AND my library does not have to deal with operator overloads so I'm
always working with method signatures instead, AND it's possible that my
class/method design is inferior to other alternatives, however it went:

1. Each number constructor allowed for an optional input scale.
2. The input number was converted into the proper formatting from allowed
input types, and then the implicit scale is set to the total number of
digits.
3. If the input scale was provided, the determined scale is set to that
value.
4. Otherwise, the determined scale at construction is set to 10 or the
implicit scale of "number of digits", whichever is larger.
5. The class contained the `roundToScale` method, which allowed you to
provide the desired scale and the rounding method, and then would set the
determined scale to that value after rounding. It contained the `round`
method with the same parameters to allow rounding to a specific scale
without also setting the internal determined scale at the same time.
6. The class contained the `setScale` method which set the value of the
internal determined scale value to an int without mutating the value at all.
7. All mathematical operation methods which depended on scale (such as div
or pow) allowed an optional input scale that would be used for calculation
if present. If it was not present, the internal calculations were done at
the higher of the two operands' determined scales plus 2, and the result
was then rounded using the default method of ROUND_HALF_EVEN if no
rounding method was provided.
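
The scale rules in the list above can be sketched roughly as follows. This
is only an illustration under stated assumptions, not code from the library
being described: the function names `determinedScale` and `calculationScale`
are hypothetical, and the final rounding step is left out.

```php
<?php
// Illustrative only: hypothetical helpers, not taken from any real library.

// Rules 2-4: an explicit input scale wins; otherwise use 10 or the
// implicit scale (total number of digits), whichever is larger.
function determinedScale(string $number, ?int $inputScale = null): int
{
    if ($inputScale !== null) {
        return $inputScale;
    }
    // Implicit scale: count every digit, ignoring sign and decimal point.
    $implicitScale = strlen((string) preg_replace('/\D/', '', $number));
    return max(10, $implicitScale);
}

// Rule 7: an explicit input scale wins; otherwise calculate at the
// higher of the two operands' determined scales, plus 2.
function calculationScale(int $scaleA, int $scaleB, ?int $inputScale = null): int
{
    return $inputScale ?? (max($scaleA, $scaleB) + 2);
}

echo determinedScale('10.5001'), "\n";          // 10 (6 digits, default wins)
echo determinedScale('1234567890.12345'), "\n"; // 15 (15 digits)
echo calculationScale(10, 15), "\n";            // 17
```

The two extra digits give the intermediate calculation some headroom before
the result is rounded back down.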

Again, though I have spent a lot of design time on this issue for the math
library I developed, my library did not have to deal with the RFC process
for PHP or maintain consistency with the conventions of PHP core, only with
the conventions it set for itself. However, I can provide a link to the
library for reference on the issue if that would be helpful for people that
are contributing to the design aspects of this RFC.

> The current assumption is that a Number always holds a single value. How
if we made it so that it held two values? They are the numerator and the
denominator.

Again, my experience here comes from developing my own library; however,
in my case I fully separated that kind of object into
its own class `Fraction`, and gave the kinds of operations we've been
discussing to the class `Decimal`. Storing numerators and denominators for
as long as possible involves a completely different set of math. For
instance, you need an algorithm to determine the Greatest Common Factor and
the Least Common Multiple in such a class, because there are a lot of
places where you would need to find the smallest common denominator or
simplify the fraction.
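
As a minimal sketch of the extra math involved (hypothetical function names,
native integers instead of arbitrary precision, and denominators assumed to
be non-zero), simplifying and adding fractions might look like this:

```php
<?php
// Illustrative only: a real implementation would use arbitrary-precision
// integers, since $a * $b below can overflow native ints.

// Greatest common factor via the Euclidean algorithm.
function gcd(int $a, int $b): int
{
    $a = abs($a);
    $b = abs($b);
    while ($b !== 0) {
        [$a, $b] = [$b, $a % $b];
    }
    return $a;
}

// Least common multiple, derived from the greatest common factor.
function lcm(int $a, int $b): int
{
    return intdiv(abs($a * $b), gcd($a, $b));
}

// Reduce a fraction to lowest terms; returns [numerator, denominator].
function simplify(int $numerator, int $denominator): array
{
    $g = gcd($numerator, $denominator);
    return [intdiv($numerator, $g), intdiv($denominator, $g)];
}

// Add two [numerator, denominator] pairs over their smallest common denominator.
function addFractions(array $x, array $y): array
{
    $denominator = lcm($x[1], $y[1]);
    $numerator = $x[0] * intdiv($denominator, $x[1])
               + $y[0] * intdiv($denominator, $y[1]);
    return simplify($numerator, $denominator);
}

echo implode('/', simplify(6, 8)), "\n";                // 3/4
echo implode('/', addFractions([1, 6], [1, 4])), "\n";  // 5/12
```

None of this overlaps with the scale and rounding logic a decimal class
needs, which is why the two are easier to keep apart.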

Abstracting between the `Fraction` and `Decimal` so that they worked with
each other honestly introduced the most complex and inscrutable code in my
entire library, so unless fractions are themselves also a design goal of
this RFC, I would recommend against it.

Jordan


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Jakub Zelenka
On Tue, Apr 2, 2024 at 5:05 PM John Coggeshall  wrote:

>
> So if we want to make sure that something like XY doesn't happen, we
> have to add some additional restrictions to those GPG keys.
>
>
> Looks like all those geeky colleagues of ours back in the day having
> key-signing parties at conferences were on to something, maybe..
>
> Let's be clear about something -- having GPG key requirements isn't going
> to help a situation like XZ. The XZ attack was done by an active maintainer
> of the project (who arguably manipulated the original maintainer of the
> project to become a maintainer themselves). It was as much a social
> engineering attack as anything.
>
> Having GPG key requirements is all fine and dandy I suppose, but my
> tongue-in-cheek comment above has a real point behind it: GPG keys don't
> mean jack if you can't trust who owns the key. Unless we want to start
> limiting contributors to people who show up at conferences to do key
> signings of their GPG keys, I question exactly what this buys the project
> other than an illusion of security and additional complexity? I couldn't
> even *really*  trust Derick to read me his GPG public key
> character-by-character over the phone nowadays thanks to AI.
>
>
It's not meant to prevent an XZ-style attack. The purpose is really just to
give actual contributors some assurance that a random person can't commit
anything in their name simply by changing the author of the commit.

See another thread [1] about prevention of the XZ attack - that basically
requires moving the actual build to the CI and having the right process to
verify it.

[1] https://externals.io/message/122811

Regards

Jakub


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Robert Landers
On Tue, Apr 2, 2024 at 2:20 AM Ilija Tovilo  wrote:
>
> Hi everyone!
>
> I'd like to introduce an idea I've played around with for a couple of
> weeks: Data classes, sometimes called structs in other languages (e.g.
> Swift and C#).
>
> In a nutshell, data classes are classes with value semantics.
> Instances of data classes are implicitly copied when assigned to a
> variable, or when passed to a function. When the new instance is
> modified, the original instance remains untouched. This might sound
> familiar: It's exactly how arrays work in PHP.
>
> ```php
> $a = [1, 2, 3];
> $b = $a;
> $b[] = 4;
> var_dump($a); // [1, 2, 3]
> var_dump($b); // [1, 2, 3, 4]
> ```
>
> You may think that copying the array on each assignment is expensive,
> and you would be right. PHP uses a trick called copy-on-write, or CoW
> for short. `$a` and `$b` actually share the same array until `$b[] =
> 4;` modifies it. It's only at this point that the array is copied and
> replaced in `$b`, so that the modification doesn't affect `$a`. As
> long as a variable is the sole owner of a value, or none of the
> variables modify the value, no copy is needed. Data classes use the
> same mechanism.
>
> But why value semantics in the first place? There are two major flaws
> with by-reference semantics for data structures:
>
> 1. It's very easy to forget cloning data that is referenced somewhere
> else before modifying it. This will lead to "spooky actions at a
> distance". Having recently used JavaScript (where all data structures
> have by-reference semantics) for an educational IR optimizer,
> accidental mutations of shared arrays/maps/sets were my primary source
> of bugs.
> 2. Defensive cloning (to avoid issue 1) will lead to useless work when
> the value is not referenced anywhere else.
>
> PHP offers readonly properties and classes to address issue 1.
> However, they further promote issue 2 by making it impossible to
> modify values without cloning them first, even if we know they are not
> referenced anywhere else. Some APIs further exacerbate the issue by
> requiring multiple copies for multiple modifications (e.g.
> `$response->withStatus(200)->withHeader('X-foo', 'foo');`).
>
> As you may have noticed, arrays already solve both of these issues
> through CoW. Data classes allow implementing arbitrary data structures
> with the same value semantics in core, extensions or userland. For
> example, a `Vector` data class may look something like the following:
>
> ```php
> data class Vector {
> private $values;
>
> public function __construct(...$values) {
> $this->values = $values;
> }
>
> public mutating function append($value) {
> $this->values[] = $value;
> }
> }
>
> $a = new Vector(1, 2, 3);
> $b = $a;
> $b->append!(4);
> var_dump($a); // Vector(1, 2, 3)
> var_dump($b); // Vector(1, 2, 3, 4)
> ```
>
> An internal Vector implementation might offer a faster and stricter
> alternative to arrays (e.g. Vector from php-ds).
>
> Some other things to note about data classes:
>
> * Data classes are ordinary classes, and as such may implement
> interfaces, methods and more. I have not decided whether they should
> support inheritance.
> * Mutating method calls on data classes use a slightly different
> syntax: `$vector->append!(42)`. All methods mutating `$this` must be
> marked as `mutating`. The reason for this is twofold: 1. It signals to
> the caller that the value is modified. 2. It allows `$vector` to be
> cloned before knowing whether the method `append` is modifying, which
> hugely reduces implementation complexity in the engine.
> * Data classes customize identity (`===`) comparison, in the same way
> arrays do. Two data objects are identical if all their properties are
> identical (including order for dynamic properties).
> * Sharing data classes by-reference is possible using references, as
> you would for arrays.
> * We may decide to auto-implement `__toString` for data classes,
> amongst other things. I am still undecided whether this is useful for
> PHP.
> * Data classes protect from interior mutability. More concretely,
> mutating nested data objects stored in a `readonly` property is not
> legal, whereas it would be if they were ordinary objects.
> * In the future, it should be possible to allow using data classes in
> `SplObjectStorage`. However, because hashing is complex, this will be
> postponed to a separate RFC.
>
> One known gotcha is that we cannot trivially enforce placement of
> `mutating` on methods without a performance hit. It is the
> responsibility of the user to correctly mark such methods.
>
> Here's a fully functional PoC, excluding JIT:
> https://github.com/php/php-src/pull/13800
>
> Let me know what you think. I will start working on an RFC draft once
> work on property hooks concludes.
>
> Ilija

Neat! I've been playing around with "value-like" objects for a while now:

https://github.com/withinboredom/time

Having inheritance supported would be useful, for example, 

Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread John Coggeshall

> So if we want to make sure that something like XY doesn't happen, we
> have to add some additional restrictions to those GPG keys.
>

Looks like all those geeky colleagues of ours back in the day having
key-signing parties at conferences were on to something, maybe..

Let's be clear about something -- having GPG key requirements isn't going to
help a situation like XZ. The XZ attack was done by an active maintainer of the
project (who arguably manipulated the original maintainer of the project to
become a maintainer themselves). It was as much a social engineering attack as
anything.

Having GPG key requirements is all fine and dandy I suppose, but my
tongue-in-cheek comment above has a real point behind it: GPG keys don't mean
jack if you can't trust who owns the key. Unless we want to start limiting
contributors to people who show up at conferences to do key signings of their
GPG keys, I question exactly what this buys the project other than an illusion
of security and additional complexity? I couldn't even really trust Derick to
read me his GPG public key character-by-character over the phone nowadays
thanks to AI.

Just Sayin'
John

Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Bilge

On 02/04/2024 15:55, Calvin Buckley wrote:
> On Apr 2, 2024, at 11:15 AM, Derick Rethans  wrote:
>> What do y'all think about requiring GPG signed commits for the php-src
>> repository?
>>
>> I had a look, and this is also something we can enforce through GitHub
>> as well (by using branch protections).
>
> Would this affect only direct pushes to master, or would it be required
> for pull requests too? I'd be worried the average drive-by contributor
> wouldn't have GPG signing set up.

FWIW, I'm a drive-by contributor and I have GPG signing set up.


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Andreas Heigl

Hey List, Hey Derick

On 02.04.24 at 16:15, Derick Rethans wrote:
> Hi,
>
> What do y'all think about requiring GPG signed commits for the php-src
> repository?


In general I think it is a good idea to do GPG signed commits. But in 
terms of security the idea is to be able to authenticate a user. But the 
only thing we truly and reliably can do is connect a github account to a 
commit. Whether that commit author is actually Jane Doe or Karl Napp is 
still not necessarily proven.


So if we want to make sure that something like XY doesn't happen, we 
have to add some additional restrictions to those GPG keys.


If it is just to have signed commits: I am absolutely in favour.

Cheers

Andreas
--
  ,,,
 (o o)
+-ooO-(_)-Ooo-+
| Andreas Heigl   |
| mailto:andr...@heigl.org  N 50°22'59.5" E 08°23'58" |
| https://andreas.heigl.org   |
+-+
| https://hei.gl/appointmentwithandreas   |
+-+
| GPG-Key: https://hei.gl/keyandreasheiglorg  |
+-+


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Niels Dossche
On 02/04/2024 16:15, Derick Rethans wrote:
> Hi,
> 
> What do y'all think about requiring GPG signed commits for the php-src 
> repository?
> 
> I had a look, and this is also something we can enforce through GitHub 
> as well (by using branch protections).
> 
> cheers,
> Derick
> 
> 

I'm in favor of this.


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Larry Garfield
On Tue, Apr 2, 2024, at 12:17 AM, Ilija Tovilo wrote:
> Hi everyone!
>
> I'd like to introduce an idea I've played around with for a couple of
> weeks: Data classes, sometimes called structs in other languages (e.g.
> Swift and C#).

*gets popcorn*

> In a nutshell, data classes are classes with value semantics.
> Instances of data classes are implicitly copied when assigned to a
> variable, or when passed to a function. When the new instance is
> modified, the original instance remains untouched. This might sound
> familiar: It's exactly how arrays work in PHP.
>
> ```php
> $a = [1, 2, 3];
> $b = $a;
> $b[] = 4;
> var_dump($a); // [1, 2, 3]
> var_dump($b); // [1, 2, 3, 4]
> ```
>
> You may think that copying the array on each assignment is expensive,
> and you would be right. PHP uses a trick called copy-on-write, or CoW
> for short. `$a` and `$b` actually share the same array until `$b[] =
> 4;` modifies it. It's only at this point that the array is copied and
> replaced in `$b`, so that the modification doesn't affect `$a`. As
> long as a variable is the sole owner of a value, or none of the
> variables modify the value, no copy is needed. Data classes use the
> same mechanism.
>
> But why value semantics in the first place? There are two major flaws
> with by-reference semantics for data structures:
>
> 1. It's very easy to forget cloning data that is referenced somewhere
> else before modifying it. This will lead to "spooky actions at a
> distance". Having recently used JavaScript (where all data structures
> have by-reference semantics) for an educational IR optimizer,
> accidental mutations of shared arrays/maps/sets were my primary source
> of bugs.
> 2. Defensive cloning (to avoid issue 1) will lead to useless work when
> the value is not referenced anywhere else.
>
> PHP offers readonly properties and classes to address issue 1.
> However, they further promote issue 2 by making it impossible to
> modify values without cloning them first, even if we know they are not
> referenced anywhere else. Some APIs further exacerbate the issue by
> requiring multiple copies for multiple modifications (e.g.
> `$response->withStatus(200)->withHeader('X-foo', 'foo');`).
>
> As you may have noticed, arrays already solve both of these issues
> through CoW. Data classes allow implementing arbitrary data structures
> with the same value semantics in core, extensions or userland. For
> example, a `Vector` data class may look something like the following:
>
> ```php
> data class Vector {
> private $values;
>
> public function __construct(...$values) {
> $this->values = $values;
> }
>
> public mutating function append($value) {
> $this->values[] = $value;
> }
> }
>
> $a = new Vector(1, 2, 3);
> $b = $a;
> $b->append!(4);
> var_dump($a); // Vector(1, 2, 3)
> var_dump($b); // Vector(1, 2, 3, 4)
> ```
>
> An internal Vector implementation might offer a faster and stricter
> alternative to arrays (e.g. Vector from php-ds).
>
> Some other things to note about data classes:
>
> * Data classes are ordinary classes, and as such may implement
> interfaces, methods and more. I have not decided whether they should
> support inheritance.

What would be the reason not to?  As you indicated in another reply, the main 
reason some languages don't is to avoid large stack copies, but PHP doesn't 
have large stack copies for objects anyway so that's a non-issue.

I've long argued that the fewer differences there are between service classes 
and data classes, the better, so I'm not sure what advantage this would have 
other than "ugh, inheritance is such a mess" (which is true, but that ship 
sailed long ago).

> * Mutating method calls on data classes use a slightly different
> syntax: `$vector->append!(42)`. All methods mutating `$this` must be
> marked as `mutating`. The reason for this is twofold: 1. It signals to
> the caller that the value is modified. 2. It allows `$vector` to be
> cloned before knowing whether the method `append` is modifying, which
> hugely reduces implementation complexity in the engine.

As discussed in R11, it would be very beneficial if this marker could be on the 
method definition, not the method invocation.  You indicated that would be 
Hard(tm), but I think it's worth some effort to see if it's surmountably hard.  
(Or at least less hard than just auto-detecting it, which you indicated is 
Extremely Hard(tm).)

> * Data classes customize identity (`===`) comparison, in the same way
> arrays do. Two data objects are identical if all their properties are
> identical (including order for dynamic properties).
> * Sharing data classes by-reference is possible using references, as
> you would for arrays.
>
> * We may decide to auto-implement `__toString` for data classes,
> amongst other things. I am still undecided whether this is useful for
> PHP.

For reference:

Java record classes auto-generate equals(), toString(), hashCode(), and 
same-name methods (we don't need 

Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Arnaud Le Blanc
On Tue, Apr 2, 2024 at 4:16 PM Derick Rethans  wrote:

> What do y'all think about requiring GPG signed commits for the php-src
> repository?
>

+1


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Derick Rethans
On Tue, 2 Apr 2024, Calvin Buckley wrote:

> On Apr 2, 2024, at 11:15 AM, Derick Rethans  wrote:
> > 
> > What do y'all think about requiring GPG signed commits for the php-src 
> > repository?
> > 
> > I had a look, and this is also something we can enforce through GitHub 
> > as well (by using branch protections).
> 
> Would this affect only direct pushes to master, or would it be required
> for pull requests too? I'd be worried the average drive-by contributor
> wouldn't have GPG signing set up.

As Ayesh said, you can also use SSH for this now:
https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification#ssh-commit-signature-verification

I think it would apply to people merging the commits. But, I am not 100% 
sure (until we try, I suppose).

cheers,
Derick



-- 
https://derickrethans.nl | https://xdebug.org | https://dram.io

Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support

mastodon: @derickr@phpc.social @xdebug@phpc.social

Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Calvin Buckley
On Apr 2, 2024, at 11:15 AM, Derick Rethans  wrote:
> 
> What do y'all think about requiring GPG signed commits for the php-src 
> repository?
> 
> I had a look, and this is also something we can enforce through GitHub 
> as well (by using branch protections).

Would this affect only direct pushes to master, or would it be required
for pull requests too? I'd be worried the average drive-by contributor
wouldn't have GPG signing set up.

Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Rowan Tommins [IMSoP]
On Tue, 2 Apr 2024, at 15:15, Derick Rethans wrote:
> Hi,
>
> What do y'all think about requiring GPG signed commits for the php-src 
> repository?

I actually thought this was already required since the GitHub move (and the 
events that led to it) 3 years ago.

It was certainly discussed: https://externals.io/message/113838#113840 and a 
user guide was created on the PHP wiki: https://wiki.php.net/vcs/commit-signing

Feedback for the idea was generally positive, but maybe nobody got around to 
actually doing it.

-- 
Rowan Tommins
[IMSoP]


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-04-02 Thread Olle Härstedt
internals+unsubscr...@lists.php.net -  550 5.7.1 Looks like spam to me.

Can't unsub...?

On Tue, 2 Apr 2024 at 16:46, Jakub Zelenka  wrote:

> On Tue, Apr 2, 2024 at 3:35 PM tag Knife  wrote:
>
>>
>> On Tue, 2 Apr 2024 at 14:53, Jakub Zelenka  wrote:
>>
>>> We will still need RM to sign the build so ideally we should make it
>>> reproducible so RM can verify that CI produced expected build and then sign
>>> it and just upload the signatures (not sure if we actually need signature
>>> uploaded or if they are used just in announcements).
>>>
>>> I think this should then prevent compromise of the RM and CI unless CI
>>> is compromised by RM, of course, but that should be very unlikely.
>>>
>>> Regards
>>>
>>> Jakub
>>>
>>>
>> On the side of the CI being compromised, this does happen, typically with
>> authed
>> private hosted CI, like jenkins. But if its open and accessible to
>> everyone to monitor, such
>> as github actions, everyone can monitor and audit the build logs to
>> verify the commands
>> ran and nothing unexpected happened during build.
>>
>> That is something PHP is missing atm, no one can verify the build process
>> for releases.
>>
>
> Yes that's what I was suggesting. This should be done by RM. In that way,
> the RM becomes more someone that verifies the build and not the actual
> person that provides the build.
>
> Regards
>
> Jakub
>
>
>


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Ayesh Karunaratne
>
> Hi,
>
> What do y'all think about requiring GPG signed commits for the php-src
> repository?
>
> I had a look, and this is also something we can enforce through GitHub
> as well (by using branch protections).
>
> cheers,
> Derick
>
>
> --
> https://derickrethans.nl | https://xdebug.org | https://dram.io
>
> Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support
>
> mastodon: @derickr@phpc.social @xdebug@phpc.social

+1 from me as well, and quite good timing with all the xz fiasco just last week.

Git can also sign with SSH keys now, so this is merely a config update.


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread David CARLIER
No problem with this; I have been doing this for a couple of days.

Cheers.

On Tue, 2 Apr 2024 at 15:37, Jakub Zelenka  wrote:

> On Tue, Apr 2, 2024 at 3:36 PM Jakub Zelenka  wrote:
>
>> On Tue, Apr 2, 2024 at 3:17 PM Derick Rethans  wrote:
>>
>>> Hi,
>>>
>>> What do y'all think about requiring GPG signed commits for the php-src
>>> repository?
>>>
>>>
>> +1, most of the devs already do that. I CC'd few of the regular devs that
>> don't sign commits (taken from the latest history) so they are aware of
>> this.
>>
>
> I meant regular committers, of course.
>


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-04-02 Thread Jakub Zelenka
On Tue, Apr 2, 2024 at 3:35 PM tag Knife  wrote:

>
> On Tue, 2 Apr 2024 at 14:53, Jakub Zelenka  wrote:
>
>> We will still need RM to sign the build so ideally we should make it
>> reproducible so RM can verify that CI produced expected build and then sign
>> it and just upload the signatures (not sure if we actually need signature
>> uploaded or if they are used just in announcements).
>>
>> I think this should then prevent compromise of the RM and CI unless CI is
>> compromised by RM, of course, but that should be very unlikely.
>>
>> Regards
>>
>> Jakub
>>
>>
> On the side of the CI being compromised, this does happen, typically with
> authed
> private hosted CI, like jenkins. But if its open and accessible to
> everyone to monitor, such
> as github actions, everyone can monitor and audit the build logs to verify
> the commands
> ran and nothing unexpected happened during build.
>
> That is something PHP is missing atm, no one can verify the build process
> for releases.
>

Yes, that's what I was suggesting. This should be done by the RM. That way,
the RM becomes someone who verifies the build rather than the person who
provides it.

Regards

Jakub


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Jakub Zelenka
On Tue, Apr 2, 2024 at 3:36 PM Jakub Zelenka  wrote:

> On Tue, Apr 2, 2024 at 3:17 PM Derick Rethans  wrote:
>
>> Hi,
>>
>> What do y'all think about requiring GPG signed commits for the php-src
>> repository?
>>
>>
> +1, most of the devs already do that. I CC'd few of the regular devs that
> don't sign commits (taken from the latest history) so they are aware of
> this.
>

I meant regular committers, of course.


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Jakub Zelenka
On Tue, Apr 2, 2024 at 3:17 PM Derick Rethans  wrote:

> Hi,
>
> What do y'all think about requiring GPG signed commits for the php-src
> repository?
>
>
+1, most of the devs already do that. I CC'd a few of the regular devs that
don't sign commits (taken from the latest history) so they are aware of
this.

Cheers

Jakub


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-04-02 Thread tag Knife
On Tue, 2 Apr 2024 at 14:53, Jakub Zelenka  wrote:

> We will still need RM to sign the build so ideally we should make it
> reproducible so RM can verify that CI produced expected build and then sign
> it and just upload the signatures (not sure if we actually need signature
> uploaded or if they are used just in announcements).
>
> I think this should then prevent compromise of the RM and CI unless CI is
> compromised by RM, of course, but that should be very unlikely.
>
> Regards
>
> Jakub
>
>
On the side of the CI being compromised, this does happen, typically with
authenticated, privately hosted CI like Jenkins. But if it's open and
accessible to everyone, as with GitHub Actions, anyone can monitor and audit
the build logs to verify which commands ran and that nothing unexpected
happened during the build.

That is something PHP is missing at the moment: no one can verify the build
process for releases.


Re: [PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Sebastian Bergmann

On 02.04.2024 at 16:15, Derick Rethans wrote:

> What do y'all think about requiring GPG signed commits for the php-src
> repository?


+1


[PHP-DEV] [VOTE] PHP 8.4 Release Managers

2024-04-02 Thread Jakub Zelenka
Hi all,

In the role of "Veteran" release manager, Eric Mann [0], the PHP 8.3 release
manager, has volunteered to mentor two rookies, so there will be two seats up
for grabs.

For those two rookie seats, we’ve got four eager candidates for your
consideration [1-4].

Voting is now open on https://wiki.php.net/todo/php84 using "Single
Transferable Vote" (STV). Those who participated in prior elections will
recognize the format; for the rest, the TL;DR is that it allows each voter to
state their preference order by voting multiple times.

There are three polls on the wiki for your three preferences, in descending
order.

Using some math that I’ll leave to Wikipedia [5] to explain, we’ll start with
the 1st preference and gradually remove candidates with the fewest votes,
transferring votes that had previously gone to them to their voters’ 2nd
preference, and so on. Once two candidates have a quorum (Droop quota), those
will be officially selected as our RMs.
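
The counting procedure described above can be sketched roughly as follows.
This is only an illustration under simplifying assumptions: it handles
eliminations but not the surplus transfers that full STV performs, it breaks
ties by candidate order, and the function names `droopQuota()` and `tally()`
as well as the ballots are made up for the example. It is not the script used
for the actual tabulation.

```php
<?php
// Simplified STV count: eliminations only, no surplus transfers (assumption).

/** Droop quota: the smallest vote total that at most $seats candidates can reach. */
function droopQuota(int $validBallots, int $seats): int
{
    return intdiv($validBallots, $seats + 1) + 1;
}

/**
 * @param list<list<string>> $ballots each ballot lists candidates in preference order
 * @return list<string> the selected candidates
 */
function tally(array $ballots, int $seats): array
{
    $quota = droopQuota(count($ballots), $seats);
    $candidates = array_unique(array_merge(...$ballots));
    $eliminated = [];

    while (true) {
        // Each ballot counts for its highest-ranked candidate still in the race.
        $counts = array_fill_keys(array_diff($candidates, $eliminated), 0);
        foreach ($ballots as $ballot) {
            foreach ($ballot as $choice) {
                if (isset($counts[$choice])) {
                    $counts[$choice]++;
                    break;
                }
            }
        }

        $reachedQuota = array_filter($counts, fn (int $votes): bool => $votes >= $quota);
        if (count($reachedQuota) >= $seats || count($counts) <= $seats) {
            arsort($counts);
            return array_slice(array_keys($counts), 0, $seats);
        }

        // Not enough candidates at quota yet: drop the weakest one and recount.
        asort($counts);
        $eliminated[] = array_key_first($counts);
    }
}

// Hypothetical ballots: 10 voters, 2 seats, so the quota is 4.
$ballots = array_merge(
    array_fill(0, 4, ['A', 'B']),
    array_fill(0, 3, ['B', 'A']),
    array_fill(0, 2, ['C', 'B']),
    [['D', 'B']],
);
print_r(tally($ballots, 2)); // A and B: D is eliminated and its ballot moves to B
```

In the first round only A reaches the quota; once D is removed, D's ballot
transfers to its second preference and B reaches the quota as well.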

I will ask Derick Rethans to proctor the tabulation of the votes, since he
hopefully still has the scripts from last year.

As you consider each candidate, please bear in mind that this is a 3.5 year
(or potentially 4.5 if the extended security support RFC passes) commitment
and is a position of trust.

Thank you in advance for your consideration.

Your 8.3 Release Managers,
Jakub Zelenka, Eric Mann & Pierrick Charron

Vote Opened: 2 April 2024 15:00:00 UTC
Vote Closes: 16 April 2024 15:00:00 UTC


Refs:
0 - Eric Mann: https://news-web.php.net/php.internals/122580
1 - Yuya Hamada: https://news-web.php.net/php.internals/122577
2 - Calvin Buckley: https://news-web.php.net/php.internals/122586
3 - Saki Takamachi: https://news-web.php.net/php.internals/122587
4 - Matteo Beccati: https://news-web.php.net/php.internals/122591
5 - https://en.wikipedia.org/wiki/Single_transferable_vote


[PHP-DEV] Requiring GPG Commit Signing

2024-04-02 Thread Derick Rethans
Hi,

What do y'all think about requiring GPG signed commits for the php-src 
repository?

I had a look, and this is something we can also enforce through GitHub
(by using branch protections).

cheers,
Derick


-- 
https://derickrethans.nl | https://xdebug.org | https://dram.io

Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support

mastodon: @derickr@phpc.social @xdebug@phpc.social


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-04-02 Thread Jakub Zelenka
Hi,

On Tue, Apr 2, 2024 at 2:36 PM Derick Rethans  wrote:

> On Sat, 30 Mar 2024, Jakub Zelenka wrote:
>
> > On Sat, Mar 30, 2024 at 7:08 AM Marco Pivetta 
> wrote:
> > >
> > > I understand that the XZ project had signed releases too: that still
> > > means that downstream consumers would need to trust the release
> > > managers anyway, and reproduce the whole chain themselves.
> > >
> > > I suppose that's part of OP's concern.
> > >
> > I agree that compromised RM is a problem that we should look into.
> >
> > We have been actually already discussing something similar. I have
> > been thinking about it and it could be potentially used for all
> > builds. The idea is that we would set up a workflow on CI that would run
> > on tag push and it would call (authenticated https request)
> > downloads.php.net server that could do the actual build, sign them and
> > return the hashes to the CI job which would display them and do extra
> > verification (probably its own build to verify that download server
> > work as expected).
>
> ...
>
> > It needs more thinking to iron out all details and make sure it is
> > secure, but I think it would be something worth looking at.
>
> I don't mind coming up with an automated way, but we probably should not
> use the *downloads* server. All it does is serve files. It has no
> compiler or anything else. It's a storage optimised instance with little
> CPU.
>
>
Yeah, I agree. I originally thought that it would be good to do it on our
own server so we could possibly sign it there as well, but after thinking
about it I rejected that signing idea, so there's really no point in doing
it on our own server.


> On CI we already test the builds, what does stop us from also just
> having it make the tarball and attach it as an artefact? We can then
> set up something on the downloads server to pull these artefacts. In
> fact, this is exactly what we're already hoping to do for Windows
> downloads too. Having it all in one place is probably even better (and
> easier).
>
> Of course, having CI make the tarballs means we need to trust that CI
> isn't compromised ;-).
>

We will still need the RM to sign the build, so ideally we should make it
reproducible so the RM can verify that CI produced the expected build, then
sign it and just upload the signatures (not sure if we actually need the
signatures uploaded or if they are used just in announcements).

I think this should then protect against a compromise of either the RM or
CI, unless CI is compromised by the RM, of course, but that should be very
unlikely.

Regards

Jakub


Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Barney Laurance

On 2024-04-02 12:26, Saki Takamachi wrote:


> Also, an idea occurred to me while reading your comments.
>
> The current assumption is that a Number always holds a single value.
> What if we made it so that it held two values? They are the numerator
> and the denominator.


Then we'd have a rational number, instead of an arbitrary precision 
decimal. I think that's a sufficiently different data type that it 
should be a different class (if required), and probably a separate RFC, 
and for now it's better to stay closer to the existing BCMath API.


Developers should be prepared to accept that an arbitrary precision 
decimal can't represent 1/3 exactly, just like a binary float can't 
represent 1/10 exactly.
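
A small illustration of that comparison, using only the existing BCMath
functions and ordinary floats; the scale of 10 is an arbitrary choice for the
example:

```php
<?php
// Requires the bcmath extension.

// A decimal with a finite scale cannot hold 1/3 exactly ...
$third = bcdiv('1', '3', 10);    // "0.3333333333"
$back  = bcmul($third, '3', 10); // "0.9999999999", not "1.0000000000"

// ... just as a binary float cannot hold 1/10 exactly.
$floatSum = 0.1 + 0.2;

var_dump($back === '1.0000000000'); // bool(false)
var_dump($floatSum === 0.3);        // bool(false)
```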


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-04-02 Thread Derick Rethans
On Sat, 30 Mar 2024, Jakub Zelenka wrote:

> On Sat, Mar 30, 2024 at 7:08 AM Marco Pivetta  wrote:
> >
> > I understand that the XZ project had signed releases too: that still 
> > means that downstream consumers would need to trust the release 
> > managers anyway, and reproduce the whole chain themselves.
> >
> > I suppose that's part of OP's concern.
> >
> I agree that compromised RM is a problem that we should look into.
> 
> We have been actually already discussing something similar. I have 
> been thinking about it and it could be potentially used for all 
> builds. The idea is that we would set up a workflow on CI that would run
> on tag push and it would call (authenticated https request) 
> downloads.php.net server that could do the actual build, sign them and 
> return the hashes to the CI job which would display them and do extra 
> verification (probably its own build to verify that download server 
> work as expected).

...

> It needs more thinking to iron out all details and make sure it is
> secure, but I think it would be something worth looking at.

I don't mind coming up with an automated way, but we probably should not 
use the *downloads* server. All it does is serve files. It has no 
compiler or anything else. It's a storage optimised instance with little 
CPU.

On CI we already test the builds, what does stop us from also just 
having it make the tarball and attach it as an artefact? We can then 
set up something on the downloads server to pull these artefacts. In
fact, this is exactly what we're already hoping to do for Windows 
downloads too. Having it all in one place is probably even better (and 
easier).

Of course, having CI make the tarballs means we need to trust that CI 
isn't compromised ;-).

cheers,
Derick

-- 
https://derickrethans.nl | https://xdebug.org | https://dram.io

Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support

mastodon: @derickr@phpc.social @xdebug@phpc.social

Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Derick Rethans
On Fri, 29 Mar 2024, Jordan LeDoux wrote:

> On Wed, Mar 27, 2024 at 12:08 AM Aleksander Machniak  wrote:
> 
> > On 27.03.2024 01:03, Saki Takamachi wrote:
> > >> $num = new BcNum('1.23', 2);
> > >> $result = $num + '1.23456';
> > >> $result->value; // '2.46456'
> > >> $result->scale; // ??
> > >
> > > In this case, `$result->scale` will be `'5'`. I added this to the 
> > > RFC.
> >
> > I'm not sure I like this. Maybe we should be more strict here and 
> > treat the $scale in constructor (and later withScale()) as the 
> > actual scale for all operations.
> >
> >
> For addition, it absolutely should expand scale like this, unless the 
> constructor also defines a default rounding type that is used in that 
> situation. All numbers, while arbitrary, will be finite, so addition 
> will always be exact and known based on inputs prior to calculation.
> 
> Treating scale like this isn't more strict, it's confusing. For 
> instance:
> 
> ```
> $numA = new Number('1.23', 2);
> $numB = new Number('1.23456', 5);
> 
> $expandedScale1 = $numA + $numB; // 2.46456
> $expandedScale2 = $numB + $numA; // 2.46456
> 
> $strictScale1 = $numA + $numB; // 2.46 assuming truncation
> $strictScale2 = $numB + $numA; // 2.46456
> ```
> 
> I ran into this same issue with operand ordering when I was writing my 
> operator overload RFC.
> 
> There are ways you could do the overload implementation that would get 
> around this for object + object operations, but it's also 
> mathematically unsound and probably unexpected for anyone who is going 
> to the trouble of using an arbitrary precision library.
> 
> Addition and subtraction should automatically use the largest scale 
> from all operands. Division and multiplication should require a 
> specified scale.

I agree. I think add/subtract also should always take the largest scale 
here.
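
For illustration, the "largest scale wins" rule for addition can be emulated
with today's function API. This is only a sketch: `scaleOf()` and
`addExpanding()` are hypothetical helpers, not part of BCMath or the RFC, and
they assume plain decimal strings without exponents.

```php
<?php
// Requires the bcmath extension.

/** Number of digits after the decimal point in a numeric string. */
function scaleOf(string $number): int
{
    $dot = strpos($number, '.');
    return $dot === false ? 0 : strlen($number) - $dot - 1;
}

/** Adds two decimal strings at the larger of the two operand scales. */
function addExpanding(string $a, string $b): string
{
    return bcadd($a, $b, max(scaleOf($a), scaleOf($b)));
}

echo addExpanding('1.23', '1.23456'), "\n"; // 2.46456
echo addExpanding('1.23456', '1.23'), "\n"; // 2.46456, operand order does not matter
echo bcadd('1.23', '1.23456', 2), "\n";     // 2.46, a fixed scale of 2 truncates
```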

cheers,
Derick

Re: [PHP-DEV] First time contributor (DateTime::setDate PR)

2024-04-02 Thread Derick Rethans
Hi,

On Sun, 31 Mar 2024, Bilge wrote:

> About the PR: I sometimes find it would be useful to only update part of the
> date. The PR makes all parameters to DateTime(Immutable)::setDate optional in a
> backwards-compatible manner such that we can elect to update only the day,
> month, year or any combination of the three (thanks, in part, to named
> parameters). Without this modification, we must always specify all of the day,
> month and year parameters to change the date.

As I mentioned to you in Room 11, I am not in favour of ad hoc API
changes to Date/Time classes. It has now been nearly 18 years since they
were originally introduced, and they indeed could do with an overhaul.

I have been collecting ideas in
https://docs.google.com/document/d/1pxPSRbfATKE4TFWw72K3p7ir-02YQbTf3S3SIxOKWsk/edit#heading=h.2jol7kfhmijb

Having different/better modifiers would also be a good thing to talk 
about, albeit perhaps on the four mentioned new classes, instead of 
adding them to the already existing DateTime and DateTimeImmutable 
classes.

In any case, allowing setDate to modify only the month is going to
introduce confusion, as this will be counterintuitive:

$dt = new DateTimeImmutable("2024-03-31");
$newDt = $dt->setDate( month: 2 );

$newDt now represents 2024-03-02, because 2024-02-31 overflows into March.

This might be the right answer, but it might also be that the developer 
just cared about the month part (and not the day-of-month), in which 
case this is a WTF moment.
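
The same rollover can be reproduced with the existing three-argument
setDate(), so the following sketch does not depend on the PR; it simply keeps
the year and day and changes the month by hand:

```php
<?php
$dt = new DateTimeImmutable('2024-03-31');

// Keep year and day, change only the month. "31 February 2024" does not
// exist, so the date rolls over into March.
$newDt = $dt->setDate(
    (int) $dt->format('Y'),
    2,
    (int) $dt->format('j')
);

echo $newDt->format('Y-m-d'), "\n"; // 2024-03-02
```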

Picking modification APIs is not as trivial as it seems, and I would like
to do it *right*.

Feel free to add comments and wishes to the Google Doc. In the
near future, I will be writing up an RFC from this.

cheers,
Derick

-- 
https://derickrethans.nl | https://xdebug.org | https://dram.io

Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support

mastodon: @derickr@phpc.social @xdebug@phpc.social


[PHP-DEV] unsubscribe

2024-04-02 Thread Joao Pedro Paula Pannain Souza



Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Saki Takamachi
Hi Jordan, Lynn,

> Something like the signature for `getNumber()` in this example would be a 
> decent solution. Operations which have ambiguous scale (of which truly only 
> div is in the BCMath library) should *require* scale in the method that calls 
> the calculation, however for consistency I can certainly see the argument for 
> requiring it for all calculation methods. The issue is how you want to handle 
> that for operator overloads, since you cannot provide arguments in that 
> situation.
> 
> Probably the most sensible way (and I think the way I handled it as well in 
> my library) is to look at both the left and right operand, grab the 
> calculated scale of the input for both (or the set scale if the scale has 
> been manually set), and then calculate with a higher scale. If internally it 
> produces a rounded result, the calculation should be done at `$desiredScale + 
> 2` to avoid compound rounding errors from the BCMath library and then the 
> implementation. If the result is truncated, the calculation should be done at 
> `$desiredScale + 1` to avoid calculating unnecessary digits.
> 
> So we have multiple usage scenarios and the behavior needs to remain 
> consistent no matter which usage occurs, and what order the items are called 
> in, so long as the resulting calculation is the same.
> 
> **Method Call**
> $bcNum = new Number('1.0394567'); // Input scale is implicitly 7
> $bcNum->div('1.2534', 3); // Resulting scale is 3
> $bcNum->div('1.2534'); // Implicit scale of denominator is 4, Implicit scale 
> of numerator is 7, calculate with scale of 8 then truncate
> 
> **Operators**
> $bcNum = new Number('1.0394567'); // Input scale is implicitly 7
> $bcNum / '1.2534'; // Implicit scale of denominator is 4, Implicit scale of 
> numerator is 7, calculate with scale of 8 then truncate
> 
> This allows you to perhaps keep an input scale in the constructor and also 
> maintain consistency across various calculations. But whatever the behavior 
> is, it should be mathematically sound, consistent across different syntax for 
> the same calculation, and never reducing scale UNLESS it is told to do so in 
> the calculation step OR during the value retrieval.

> I'm inexperienced when it comes to maths and the precision here, but I do 
> have some experience when it comes to what the business I work for wants. 
> I've implemented BCMath in a couple of places where this kind of precision is 
> necessary, and I found that whenever I do divisions I prefer having at least 
> 2 extra digits. Would it make sense to internally always just store a more 
> accurate number? For things like additions/multiplications/subtractions it 
> could always use the highest precision, and then for divisions add like +3~6 
> or something. Whenever you have numbers that have a fraction like `10.5001` 
> it makes sense to set it to 4, but when you have `10` it suddenly becomes 0 
> when implicitly setting it. 
> 
> For the following examples assume each number is a BcNum:
> When doing something like `10 * 10. * 10.0` I want the end result 
> to have a precision of at least 9 so I don't lose information. When I do 
> `((10 / 3) * 100) * 2` I don't want it to implicitly become 0, because the 
> precision here is important to me. I don't think using infinite precision 
> here is a reasonable approach either. I'm not sure what the correct answer 
> is, perhaps it's just "always manually set the precision"?

Thanks for the important perspective feedback.

One thing I overlooked: if the exponent of pow is negative, the scale of the 
result becomes unpredictable, just like with div.

e.g.
```
3 ** -1
= 0.3333...
```
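
With the existing function API the same effect is visible in bcpow(), where
the caller has to pick the scale because the exact result has no finite
decimal expansion. A short illustration (the scales are arbitrary):

```php
<?php
// Requires the bcmath extension.
echo bcpow('3', '-1', 2), "\n";  // 0.33
echo bcpow('3', '-1', 10), "\n"; // 0.3333333333
echo bcpow('2', '-2', 2), "\n";  // 0.25, exact in this case
```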

Also, an idea occurred to me while reading your comments.

The current assumption is that a Number always holds a single value. What if we
made it so that it held two values, the numerator and the denominator?

This means that when we divide, no division is done internally; instead, we
multiply the denominator. At the very end of the process, when converting to a
string, any deferred division is performed according to the specified scale.

If we have the option of not specifying a scale when converting to a string, it
may be preferable to convert based on an implicit scale.
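
A rough userland sketch of this idea, to make the trade-off concrete. The
class name `DeferredNumber` and its methods are hypothetical and not part of
the RFC, and the sketch only accepts integer strings to stay short:

```php
<?php
// Requires the bcmath extension.
final class DeferredNumber
{
    private function __construct(
        private string $numerator,
        private string $denominator,
    ) {}

    public static function of(string $integer): self
    {
        return new self($integer, '1');
    }

    public function mul(self $other): self
    {
        return new self(
            bcmul($this->numerator, $other->numerator, 0),
            bcmul($this->denominator, $other->denominator, 0),
        );
    }

    /** Dividing only multiplies by the reciprocal, so no digits are lost here. */
    public function div(self $other): self
    {
        return new self(
            bcmul($this->numerator, $other->denominator, 0),
            bcmul($this->denominator, $other->numerator, 0),
        );
    }

    /** The one real division happens at output time, at the requested scale. */
    public function toString(int $scale): string
    {
        return bcdiv($this->numerator, $this->denominator, $scale);
    }
}

$ten = DeferredNumber::of('10');
$three = DeferredNumber::of('3');

echo $ten->div($three)->mul($three)->toString(2), "\n"; // 10.00
echo bcmul(bcdiv('10', '3', 2), '3', 2), "\n";          // 9.99 when dividing eagerly
```

The price is that numerator and denominator keep growing unless they are
reduced, which this sketch does not attempt.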

Regards.

Saki

Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Lynn
On Tue, Apr 2, 2024 at 11:17 AM Jordan LeDoux 
wrote:

>
>
> On Sat, Mar 30, 2024 at 5:09 PM Saki Takamachi  wrote:
>
>> Hi Jordan,
>>
>> Your opinion may be reasonable given the original BCMath calculation
>> order. That is, do you intend code like this?
>>
>> Signature:
>> ```
>> // public function __construct(string|int $number)
>> // public function getNumber(?int $scale = null): string
>> ```
>>
>> Add:
>> ```
>> // public function add(Number|string|int $number): string
>>
>> $num = new Number('1.23456');
>> $num2 = new Number('1.23');
>>
>> $add = $num + $num2;
>> $add->getNumber(); // '2.46456'
>> $add->getNumber(1); // '2.4'
>>
>> $add = $num->add($num2);
>> $add->getNumber(); // '2.46456'
>> $add->getNumber(1); // '2.4'
>> ```
>>
>> Div:
>> ```
>> // public function div(Number|string|int $number, int
>> $scaleExpansionLimit = 10): string
>>
>>
>> // case 1
>> $num = new Number('0.0001');
>> $num2 = new Number('3');
>>
>> $div = $num / $num2; // scale expansion limit is always 10
>> $div->getNumber(); // '0.3'
>>
>> $div = $num->div($num2, 20);
>> $div->getNumber(); // '0.333'
>> $div->getNumber(7); // '0.333'
>>
>>
>> // case 2
>> $num = new Number('1.11');
>> $num2 = new Number('3');
>>
>> $div = $num->div($num2, 3);
>> $div->getNumber(); // '0.370'
>> $div->getNumber(7); // '0.370'
>> ```
>>
>> Since the scale can be inferred for everything other than div, a special
>> argument is given only for div.
>>
>> Regards.
>>
>> Saki
>
>
> Something like the signature for `getNumber()` in this example would be a
> decent solution. Operations which have ambiguous scale (of which truly only
> div is in the BCMath library) should *require* scale in the method that
> calls the calculation, however for consistency I can certainly see the
> argument for requiring it for all calculation methods. The issue is how you
> want to handle that for operator overloads, since you cannot provide
> arguments in that situation.
>
> Probably the most sensible way (and I think the way I handled it as well
> in my library) is to look at both the left and right operand, grab the
> calculated scale of the input for both (or the set scale if the scale has
> been manually set), and then calculate with a higher scale. If internally
> it produces a rounded result, the calculation should be done at
> `$desiredScale + 2` to avoid compound rounding errors from the BCMath
> library and then the implementation. If the result is truncated, the
> calculation should be done at `$desiredScale + 1` to avoid calculating
> unnecessary digits.
>
> So we have multiple usage scenarios and the behavior needs to remain
> consistent no matter which usage occurs, and what order the items are
> called in, so long as the resulting calculation is the same.
>
> **Method Call**
> $bcNum = new Number('1.0394567'); // Input scale is implicitly 7
> $bcNum->div('1.2534', 3); // Resulting scale is 3
> $bcNum->div('1.2534'); // Implicit scale of denominator is 4, Implicit
> scale of numerator is 7, calculate with scale of 8 then truncate
>
> **Operators**
> $bcNum = new Number('1.0394567'); // Input scale is implicitly 7
> $bcNum / '1.2534'; // Implicit scale of denominator is 4, Implicit scale
> of numerator is 7, calculate with scale of 8 then truncate
>
> This allows you to perhaps keep an input scale in the constructor and also
> maintain consistency across various calculations. But whatever the behavior
> is, it should be mathematically sound, consistent across different syntax
> for the same calculation, and never reducing scale UNLESS it is told to do
> so in the calculation step OR during the value retrieval.
>
> Jordan
>

I'm inexperienced when it comes to maths and the precision here, but I do
have some experience when it comes to what the business I work for wants.
I've implemented BCMath in a couple of places where this kind of precision
is necessary, and I found that whenever I do divisions I prefer having at
least 2 extra digits. Would it make sense to internally always just store a
more accurate number? For things like
additions/multiplications/subtractions it could always use the highest
precision, and then for divisions add like +3~6 or something. Whenever you
have numbers that have a fraction like `10.5001` it makes sense to set it
to 4, but when you have `10` it suddenly becomes 0 when implicitly setting
it.

For the following examples assume each number is a BcNum:
When doing something like `10 * 10. * 10.0` I want the end
result to have a precision of at least 9 so I don't lose information. When
I do `((10 / 3) * 100) * 2` I don't want it to implicitly become 0, because
the precision here is important to me. I don't think using infinite
precision here is a reasonable approach either. I'm not sure what the
correct answer is, perhaps it's just "always manually set the precision"?
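
To make the concern concrete, here is the `((10 / 3) * 100) * 2` chain
evaluated with the existing functions at a few fixed scales. The helper name
`chain()` is made up for the example:

```php
<?php
// Requires the bcmath extension.

/** Evaluates ((10 / 3) * 100) * 2 with every step carried out at $scale. */
function chain(int $scale): string
{
    $step = bcdiv('10', '3', $scale);
    $step = bcmul($step, '100', $scale);
    return bcmul($step, '2', $scale);
}

echo chain(0), "\n"; // 600
echo chain(2), "\n"; // 666.00
echo chain(6), "\n"; // 666.666600
```

The digits lost in the division are multiplied up by the later steps, which
is why a scale inferred from the integer inputs (0) gives 600 here.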


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Ilija Tovilo
Hi Alexander

On Tue, Apr 2, 2024 at 4:53 AM Alexander Pravdin  wrote:
>
> On Tue, Apr 2, 2024 at 9:18 AM Ilija Tovilo  wrote:
> >
> > I'd like to introduce an idea I've played around with for a couple of
> > weeks: Data classes, sometimes called structs in other languages (e.g.
> > Swift and C#).
>
> While I like the idea, I would like to suggest something else in
> addition or as a separate feature. As an active user of readonly
> classes with all promoted properties for data-holding purposes, I
> would be happy to see the possibility of cloning them with passing
> some properties to modify:
>
> readonly class Data {
> function __construct(
> public string $foo,
> public string $bar,
> public string $baz,
> ) {}
> }
>
> $data = new Data(foo: 'A', bar: 'B', baz: 'C');
>
> $data2 = clone $data with (bar: 'X', baz: 'Y');

What you're asking for is part of the "Clone with" RFC:
https://wiki.php.net/rfc/clone_with

This issue is valid and the RFC would improve the ergonomics of
readonly classes.

However, note that it really only addresses a small part of what this
RFC tries to achieve:

> Some APIs further exacerbate the issue by
> requiring multiple copies for multiple modifications (e.g.
> `$response->withStatus(200)->withHeader('X-foo', 'foo');`).

Readonly works fine for compact data structures, even if it is copied
more than it needs. For large data structures, like large lists, a
copy for each modification would be detrimental.

https://3v4l.org/GR6On

See how the performance of an insert into an array tanks if a copy of
the array is performed in each iteration (due to an additional
reference to it). Readonly is just not viable for data structures such
as lists, maps, sets, etc.
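
For readers who cannot open the link, a minimal sketch of the effect (not
necessarily the same script as on 3v4l; the function name `fill()` and the
item count are arbitrary):

```php
<?php
/** Appends $n integers; with $keepExtraReference a second handle forces a copy per append. */
function fill(int $n, bool $keepExtraReference): array
{
    $list = [];
    for ($i = 0; $i < $n; $i++) {
        if ($keepExtraReference) {
            // A second handle to the same array, so the next write has to
            // separate (copy) it first.
            $snapshot = $list;
        }
        $list[] = $i;
    }
    return $list;
}

foreach ([false, true] as $extra) {
    $start = hrtime(true);
    $result = fill(10000, $extra);
    $ms = (hrtime(true) - $start) / 1e6;
    printf("extra reference: %-3s %d items in %.1f ms\n", $extra ? 'yes' : 'no', count($result), $ms);
}
```

Both runs produce the same array; only the run that keeps the extra handle
pays for a full copy on every append, so its cost grows quadratically.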

Ilija


Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-04-02 Thread Jordan LeDoux
On Sat, Mar 30, 2024 at 5:09 PM Saki Takamachi  wrote:

> Hi Jordan,
>
> Your opinion may be reasonable given the original BCMath calculation
> order. That is, do you intend code like this?
>
> Signature:
> ```
> // public function __construct(string|int $number)
> // public function getNumber(?int $scale = null): string
> ```
>
> Add:
> ```
> // public function add(Number|string|int $number): string
>
> $num = new Number('1.23456');
> $num2 = new Number('1.23');
>
> $add = $num + $num2;
> $add->getNumber(); // '2.46456'
> $add->getNumber(1); // '2.4'
>
> $add = $num->add($num2);
> $add->getNumber(); // '2.46456'
> $add->getNumber(1); // '2.4'
> ```
>
> Div:
> ```
> // public function div(Number|string|int $number, int $scaleExpansionLimit
> = 10): string
>
>
> // case 1
> $num = new Number('0.0001');
> $num2 = new Number('3');
>
> $div = $num / $num2; // scale expansion limit is always 10
> $div->getNumber(); // '0.3'
>
> $div = $num->div($num2, 20);
> $div->getNumber(); // '0.333'
> $div->getNumber(7); // '0.333'
>
>
> // case 2
> $num = new Number('1.11');
> $num2 = new Number('3');
>
> $div = $num->div($num2, 3);
> $div->getNumber(); // '0.370'
> $div->getNumber(7); // '0.370'
> ```
>
> Since the scale can be inferred for everything other than div, a special
> argument is given only for div.
>
> Regards.
>
> Saki


Something like the signature for `getNumber()` in this example would be a
decent solution. Operations which have ambiguous scale (of which truly only
div is in the BCMath library) should *require* scale in the method that
calls the calculation, however for consistency I can certainly see the
argument for requiring it for all calculation methods. The issue is how you
want to handle that for operator overloads, since you cannot provide
arguments in that situation.

Probably the most sensible way (and I think the way I handled it as well in
my library) is to look at both the left and right operand, grab the
calculated scale of the input for both (or the set scale if the scale has
been manually set), and then calculate with a higher scale. If internally
it produces a rounded result, the calculation should be done at
`$desiredScale + 2` to avoid compound rounding errors from the BCMath
library and then the implementation. If the result is truncated, the
calculation should be done at `$desiredScale + 1` to avoid calculating
unnecessary digits.
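
As a rough sketch of the guard-digit approach for the rounded case, assuming
round-half-away-from-zero and using only bcdiv/bcadd/bcsub; `divRounded()` is
a hypothetical helper, not the code from my library or from the RFC:

```php
<?php
// Requires the bcmath extension.

/** Divides with two guard digits, then rounds half away from zero to $scale. */
function divRounded(string $a, string $b, int $scale): string
{
    // BCMath truncates, so compute two digits more than requested.
    $raw = bcdiv($a, $b, $scale + 2);

    // Half a unit in the last requested place, e.g. "0.005" for scale 2.
    $half = '0.' . str_repeat('0', $scale) . '5';

    $adjusted = $raw[0] === '-'
        ? bcsub($raw, $half, $scale + 1)
        : bcadd($raw, $half, $scale + 1);

    // Adding zero at the target scale drops the remaining extra digit.
    return bcadd($adjusted, '0', $scale);
}

echo divRounded('2', '3', 2), "\n";  // 0.67
echo divRounded('1', '3', 2), "\n";  // 0.33
echo divRounded('-2', '3', 2), "\n"; // -0.67
echo bcdiv('2', '3', 2), "\n";       // 0.66, plain truncation for comparison
```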

So we have multiple usage scenarios and the behavior needs to remain
consistent no matter which usage occurs, and what order the items are
called in, so long as the resulting calculation is the same.

**Method Call**
$bcNum = new Number('1.0394567'); // Input scale is implicitly 7
$bcNum->div('1.2534', 3); // Resulting scale is 3
$bcNum->div('1.2534'); // Implicit scale of denominator is 4, Implicit
scale of numerator is 7, calculate with scale of 8 then truncate

**Operators**
$bcNum = new Number('1.0394567'); // Input scale is implicitly 7
$bcNum / '1.2534'; // Implicit scale of denominator is 4, Implicit scale of
numerator is 7, calculate with scale of 8 then truncate

This allows you to perhaps keep an input scale in the constructor and also
maintain consistency across various calculations. But whatever the behavior
is, it should be mathematically sound, consistent across different syntax
for the same calculation, and never reducing scale UNLESS it is told to do
so in the calculation step OR during the value retrieval.

Jordan


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-02 Thread Ilija Tovilo
Hi Marco

On Tue, Apr 2, 2024 at 2:56 AM Deleu  wrote:
>
>
>
> On Mon, Apr 1, 2024 at 9:20 PM Ilija Tovilo  wrote:
>>
>> I'd like to introduce an idea I've played around with for a couple of
>> weeks: Data classes, sometimes called structs in other languages (e.g.
>> Swift and C#).
>>
>> snip
>>
>> Some other things to note about data classes:
>>
>> * Data classes are ordinary classes, and as such may implement
>> interfaces, methods and more. I have not decided whether they should
>> support inheritance.
>
> I'd argue in favor of not including inheritance in the first version. Taking 
> inheritance out is an impossible BC Break. Not introducing it in the first 
> stable release gives users a chance to evaluate whether it's something we 
> will drastically miss.

I would probably agree. I believe the reason some languages don't
support inheritance for value types is that they are stored on the
stack. Inheritance encourages large structures, but copying very large
structures over and over on the stack may be slow.

In PHP, objects always live on the heap, and due to CoW we don't have
this problem. Still, it may be beneficial to disallow inheritance
first, and relax this restriction if it is necessary.

>> * Mutating method calls on data classes use a slightly different
>> syntax: `$vector->append!(42)`. All methods mutating `$this` must be
>> marked as `mutating`. The reason for this is twofold: 1. It signals to
>> the caller that the value is modified. 2. It allows `$vector` to be
>> cloned before knowing whether the method `append` is modifying, which
>> hugely reduces implementation complexity in the engine.
>
> I'm not sure if I understood this one. Do you mean that the `!` modifier here 
> (at call-site) is helping the engine clone the variable before even diving 
> into whether `append()` has been tagged as mutating?

Precisely. The issue comes from deeper nested values:

$circle->position->zero();

Imagine that Circle is a data class with a Position, which is also a
data class. Position::zero() is a mutating method that sets the
coordinates to 0:0. For this to work, not only the position needs to
be copied, but also $circle. However, the engine doesn't yet know
ahead of time whether zero() is mutating, and as such needs to perform
a copy.

One idea was to evaluate the left-hand-side of the method call, and
repeat it with a copy if the method is mutating. However, this is not
trivially possible, because opcodes consume their operands. So, for an
expression like `getCircle()->position->zero()`, the return value of
`getCircle()` is already gone. `!` explicitly distinguishes the call
from non-mutating calls, and knows that a copy will be needed.
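
To illustrate why both levels need to be separated, here is an emulation with
ordinary PHP 8 classes. It copies eagerly in `__clone()` rather than lazily,
and it is not how the proposed data classes would be implemented; only the
names Circle, Position and zero() come from the example above.

```php
<?php
final class Position
{
    public function __construct(public int $x, public int $y) {}

    public function zero(): void
    {
        $this->x = 0;
        $this->y = 0;
    }
}

final class Circle
{
    public function __construct(public Position $position, public int $radius) {}

    // Emulate value semantics eagerly: copying a circle also copies its position.
    public function __clone()
    {
        $this->position = clone $this->position;
    }
}

$original = new Circle(new Position(3, 4), 10);

$copy = clone $original; // both levels are separated here
$copy->position->zero();

echo $original->position->x, ':', $original->position->y, "\n"; // 3:4
echo $copy->position->x, ':', $copy->position->y, "\n";         // 0:0
```

Without the `__clone()` hook, the two circles would share one Position object
and zero() would change both.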

But as mentioned previously, I think a different syntax offers
additional benefits for readability.

> From outside it looks odd that a clone would happen ahead-of-time while 
> talking about copy-on-write. Would this syntax break for non-mutating methods?

If by break you mean the engine would error, then yes. Only mutating
methods may (and must) be called with the $foo->bar!() syntax.

Ilija


[PHP-DEV] Re: [RFC] [Discussion] [VOTE] Rounding Integers as int

2024-04-02 Thread Marc Bennewitz

Hi internals,

On 17.03.24 13:23, Marc Bennewitz wrote:

> Hello internals,
>
> I have opened the vote for the "Rounding Integers as int" RFC:
> https://wiki.php.net/rfc/integer-rounding
>
> Due to the Easter weekend, the vote will run for two weeks and two days
> until Tuesday, the 2nd of April 2024.



The RFC has been declined with 18 votes against and 0 in favor.

Kind regards,
Marc Bennewitz



