Re: [PHP-DEV] Re: Alias for `int|float`

2022-09-28 Thread Hamza Ahmad
Hi Olle,

I appreciate your idea of introducing a similar concept of typedef.
What if you write an RFC explaining that concept. I can join you
however in co-authoring this request.

Seriously, I have had issues with writing such type again and again.
If you look at mixed type, it is indeed an alias of some union types.
Adding such an option to userland would lead to introduction of
various type hints. Plus, it will help devs write short hand aliases
for their intersection types.

Regards

Ahmad


On 9/27/22, Olle Härstedt  wrote:
>>Hello, internals!
>>
>>What do you think about creating shorthand `number` for `int|float`?
>
> This is also related to allow all types of type-hints in the use-statement,
> like
>
> use int|float as numeric;
>
> Anyone interested in creating an RFC for this?
>
> Olle
>
> PS. Original thread here: https://externals.io/message/112209#112209
> (dunno how to respond to an old thread when I don't have the original
> message)
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: https://www.php.net/unsub.php
>
>

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC [Discussion]: Improve unserialize() error handling

2022-09-28 Thread Tim Düsterhus

Hi

On 9/28/22 17:42, Tim Düsterhus wrote:

Would the increase of E_NOTICE to E_WARNING for the two cases that
currently emit E_NOTICE be something that even requires a vote or is
this something that can simply be decided by merging a PR? [1] In that
case the second and third vote could be simplified to "Convert E_* to
Exception" with the regular 2/3 majority required.

[1] I would hope that unifying E_NOTICE/E_WARNING to E_WARNING
everywhere is uncontroversial.


I've now created a standalone PR for this:

https://github.com/php/php-src/pull/9629

While creating that one, I also came across this:

https://github.com/php/php-src/pull/9630

Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC [Discussion]: Improve unserialize() error handling

2022-09-28 Thread Tim Düsterhus

Hi

On 9/28/22 19:53, Nicolas Grekas wrote:

I'm not quoting the full text of the discussion because the php ML server
refused the message - too long it said.


Yeah, that's fine. I also aggressively prune irrelevant parts from 
whatever I quote; saves traffic and storage, and also makes it easier to 
find the answers. Additional context is available by looking up the 
quoted message.



Based on my analysis and understanding of the linked examples, the only
one that *might* break (I don't fully understand it), is the one in
src/Symfony/Component/VarExporter/Internal/Registry.php. The others will
either no change or change to be more correct.

[…]



As you saw, the BC break is real. That's what I wanted to highlight and
what worries me for the community at large. This is not some niche BC break.


That's not at all what I'd say I've seen. My argument (see quote) was 
that while it *technically* breaks BC under a strict definition, it does 
not change behavior in a *negative way*. I don't find this strict 
definition of BC break particularly useful, because that effectively 
results in https://xkcd.com/1172/ ("Workflow"), preventing all progress.



There is also an argument that is missing in this conversation and that
Yasuo made (in another thread by mistake I guess?):


Yes, I have seen that one. For reference of the other readers, the email 
is here: https://externals.io/message/118554#118684.



In general, unserializing external serialised data is the same as "Read and

execute remote PHP code"
(Executing arbitrary code via memory corruption is trivial task with
unserialize())


I am aware of that and acknowledged the security issue in the third list 
point of this section:


https://wiki.php.net/rfc/improve_unserialize_error_handling#increase_the_error_reporting_severity_in_the_unserialize_parser


In all valid use cases of unserialize(), we do trust the backend where we
get serialized payloads from. This means constructed broken payloads are
not something we need to deal with (the DateTimeImmutable example you
gave).


Okay, that's fair with regard to DTI. The 'Error' case still exists for 
readonly classes that lose properties, so ...



The thing we need to guard against is much narrower. That makes the problem
this PR aims to solve much smaller. Truncated payloads are a thing because
of race conditions or unreliable network services. FC/BC of serialized
payloads is another thing that can only be dealt with by implementing
__unserialize() or the likes. And that's mostly it. Other errors (e.g.


... while it is true that this FC/BC of payloads needs to be dealt with 
by implementing __unserialize(), you can't rely on every class doing 
that (correctly). That's why you currently need to catch 'Throwable', 
not just 'Exception' and that's why the RFC proposes to wrap the Throwables.


Truncated payloads should not happen even with unreliable network 
services, because of integrity checks embedded into the protocol (e.g. 
TLS). In any case, truncated payloads will currently emit E_NOTICE, so 
that relates to the second vote (E_WARNING to Exception) more than the 
first vote (wrap Exception), whereas I understand your concerns are 
mostly with the "wrap Exception" part.



parse error) are not caused by unserialize itself, but by regular user code.



Sorry, I don't understand what "parse error" or "other errors" you are 
referring to.



"RFC: Static variables in inherited methods" -
https://wiki.php.net/rfc/static_variable_inheritance

That RFC actually changed the behavior of the application I maintain.
You voted in favor of the RFC, but did not participate in the
discussion. The fix was simple and I agree that the new behavior is
better, so no hard feelings. I'd be curious though how the breakage from
that RFC is more acceptable to you than the anticipated breakage from
this RFC? Honest question to understand your PoV better.



Rereading the RFC, Nikita described a niche and obscure behavior of the
engine. There are no numbers to support that understanding (how "niche"
this was) and that's missing in the RFC. I remember other RFCs which did
have numbers to quantify the impact. I guess it was hard to get that


Understood, thank you for looking into this. I believe raw numbers are 
not the (most) important metric. IMO subtlety of the break is more 
important. A subtle change that affects a small minority is worse than a 
less subtle change that affects more users.



number. And that's also why I made this gist: so that the impact of the BC
break could be evaluated somehow. Many spots in Symfony likely means many
more spots in the PHP community. I'm not going to be able to conduct a


I didn't and don't expect you to perform a larger-scale analysis, that's 
hard to do, because one has to look at the *context* of the 
'unserialize()' call to determine what exactly the intended behavior is. 
That's what I attempted to do with your Symfony examples (I am not a 
Symfony user and might have 

Re: [PHP-DEV] RFC [Discussion]: Improve unserialize() error handling

2022-09-28 Thread Nicolas Grekas
Hi Tim,

I'm not quoting the full text of the discussion because the php ML server
refused the message - too long it said.

> this gist:
> > https://gist.github.com/nicolas-grekas/5dd3169f94ed3b4576152605330824fe
> >
> >
>
> [...] (see previous messages on the thread for this missing part)
>
> >> I'm asking, because the examples fail to explain the "why". Why is the
> >> code written like this? I fail to see how catching "Exception", but not
> >> catching "Error" during unserialization is ever going to be useful.
> >>
> >
> > It's useful because these two roots mean very different things.
>
> I understand the distinction between Error and Exception in general.
>
> However I've specifically asked about the distinction in the context of
> unserialization. An Exception during unserialization is not more or less
> a programmer error compared to an Error. You generally put in some
> opaque string into unserialize and you hopefully get out some useful
> value. If the string is broken you either get an E_FOO or some random
> Throwable you can't control. You can't magically fix the input string
> and you can't really prevent the errors happening either.
>
> As pointed out in the list above: DateTimeImmutable will already throw
> an Error, whereas other classes will throw an Exception. That does not
> mean that an error during unserialization of DateTimeImmutable is worse,
> because the choice is pretty arbitrary.
>
> The documentation of unserialize()
> (https://www.php.net/manual/en/function.unserialize.php) also indicates
> that:
>
>  > Objects may throw Throwables in their unserialization handlers.
>
> Further indicating that you are not guaranteed to only get Exception or
> only get Error out of it.
>
> >
> >> Likewise I don't see how catching ErrorException specifically is useful
> >> when '__unserialize()' might throw arbitrary Throwables.
> >
> >
> > It's useful when all you need is eg to care about the exceptions thrown
> by
> > a custom error handler, and want to ignore the other ones (because other
> > exceptions aren't part of the current domain layer.)
> >
>
> Basically see response to previous quote.
>
> >> Quoting myself, because the examples didn't answer the question: "I am
> >> unable to think of a situation where the input data is well-defined
> >> enough to reliably throw a specific type of exception, but not
> >> well-defined enough to successfully unserialize".
> >>
> >> And don't get me wrong: I'm not attempting to say that it is appropriate
> >> to break logic that I personally consider wrong without justification. I
> >> believe the benefits of having the unified Exception outweigh the
> >> drawbacks of introducing a behavioral change in code that is already
> >> subtly broken depending on the exact input value passed to
> unserialize().
> >>
> >
> > The thoughts you describe might make sense as a rule of thumb principle
> > when you write your code. It might even be upgraded to a best practice.
> > Generally turning an Error to an Exception is not by my book, but I'm not
> > arguing about this. My point is : /!\ BC break ahead, do not cross /!\
> >
> > If you look at the gist I linked before, wrapping all throwables would
> > break most if not all the cases I listed. That's bad numbers, and I don't
> > think this is specific in any way to the Symfony codebase.
> >
>
> Based on my analysis and understanding of the linked examples, the only
> one that *might* break (I don't fully understand it), is the one in
> src/Symfony/Component/VarExporter/Internal/Registry.php. The others will
> either no change or change to be more correct.
>
> >
> >> This is similar to the attribute syntax breaking hash comments that
> >> start with a square bracket without any space in the middle. Or the '@@'
> >> attribute syntax that was also proposed. It would have broken code
> >> containing duplicated error suppression operators.
> >>
> >> Similarly to the attribute breakage, it's also reasonably easy to find
> >> possibly affected logic by grepping for 'unserialize('.
> >>
> >
> > Sorry to disagree, that's a very different sort of break. One that
> changes
> > the behavior of perfectly fine apps.
> >
>
> Disagreeing is fine. Without disagreement we would not need the RFC
> process. However I disagree on the "perfectly fine" part. I believe that
> some of the Symfony examples are already subtly broken (the
> DateTimeImmutable part).
>

As you saw, the BC break is real. That's what I wanted to highlight and
what worries me for the community at large. This is not some niche BC break.

There is also an argument that is missing in this conversation and that
Yasuo made (in another thread by mistake I guess?):

In general, unserializing external serialised data is the same as "Read and
> execute remote PHP code"
> (Executing arbitrary code via memory corruption is trivial task with
> unserialize())
>
> Therefore, developers must validate serialized data integrity at least
> with HMAC always, so that serialized data 

Re: [PHP-DEV] RFC [Discussion]: Improve unserialize() error handling

2022-09-28 Thread Tim Düsterhus

Hi

On 9/28/22 13:41, Christoph M. Becker wrote:

Predicting people's second-place choice is risky business.  This assumption 
seems logical on its face, but I'm sure there are people that will buck your 
expectations.


The reasoning is that unless “E_WARNING in 8.x without future decision” 
receives more than 50%, more than 50% prefer an Exception no later than 9.0. 
Unless “UnserializationFailedException in 8.x” receives more than 50%, more 
than 50% prefer no Exception in 8.x.


If you want to go that route, I'd go all the way to an RCV vote and be done 
with it.  Or else just make an executive decision as the RFC author and let the 
chips fall where they may.


I'm generally not too happy with secondary votes.  Sometimes you only
support the primary vote for certain secondary options; to "be sure"
that another secondary option won't "win", you'd need to vote "no" on
the primary choice.

I'd prefer a single vote with pre-selected details.  I don't have any
particular preference in this case, though.



Would the increase of E_NOTICE to E_WARNING for the two cases that 
currently emit E_NOTICE be something that even requires a vote or is 
this something that can simply be decided by merging a PR? [1] In that 
case the second and third vote could be simplified to "Convert E_* to 
Exception" with the regular 2/3 majority required.


[1] I would hope that unifying E_NOTICE/E_WARNING to E_WARNING 
everywhere is uncontroversial.


Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC [Discussion]: Improve unserialize() error handling

2022-09-28 Thread Christoph M. Becker
On 27.09.2022 at 22:11, Larry Garfield wrote:

> On Tue, Sep 27, 2022, at 3:01 PM, Tim Düsterhus wrote:
>
>> Thank you, I thought about what to do here and I've adjusted the options
>> in the "increase to what" vote to make this a 3-way vote:
>>
>> https://wiki.php.net/rfc/improve_unserialize_error_handling#increasing_the_severity_of_existing_warningsnotices
>>
>> Do you believe that my reasoning with regard to the interpretation of
>> the vote's results is sound? A ranked choice vote should not necessary
>> here, because the three options follow a natural order with regard to
>> severity/possible breakage.
>
> Predicting people's second-place choice is risky business.  This assumption 
> seems logical on its face, but I'm sure there are people that will buck your 
> expectations.
>
>> The reasoning is that unless “E_WARNING in 8.x without future decision” 
>> receives more than 50%, more than 50% prefer an Exception no later than 9.0. 
>> Unless “UnserializationFailedException in 8.x” receives more than 50%, more 
>> than 50% prefer no Exception in 8.x.
>
> If you want to go that route, I'd go all the way to an RCV vote and be done 
> with it.  Or else just make an executive decision as the RFC author and let 
> the chips fall where they may.

I'm generally not too happy with secondary votes.  Sometimes you only
support the primary vote for certain secondary options; to "be sure"
that another secondary option won't "win", you'd need to vote "no" on
the primary choice.

I'd prefer a single vote with pre-selected details.  I don't have any
particular preference in this case, though.

--
Christoph M. Becker

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php