On 25/04/2022 10:33, Craig Francis wrote:
The fact that internal functions have parameter parsing behaviour that is
almost impossible to implement in userland, and often not even consistent
between functions, is a wart of engine internals, not a design decision.
Bit of a tangent, but do you have some examples? would be nice to clean some of
these up, or at least have them in mind as we discuss this RFC.
Fundamentally, the internal parameter handling system (ZPP) is
completely separate from the way function signatures work in userland,
and evolved based on a different set of requirements. The emphasis of
ZPP is on unwrapping zval structs to values that can be manipulated
directly in C; so, for instance, it has always had support for integer
parameters. Since 7.0, userland signatures have evolved an essentially
parallel set of features with an emphasis on designing a consistent and
useful dynamic typing system.
Increasingly, ZPP is being aligned with the userland language, which
also allows reflection information to be generated based on PHP stubs.
For instance:
* Making rejected parameters throw TypeError rather than raise a Warning
and return null
* Giving optional parameters an explicit default in the signature rather
than inspecting the argument count
* Using union types, rather than ad hoc if/switch on zval type
The currently proposed change to how internal functions handle nulls in
9.0 is just another part of that process - the userland behaviour is
well-established, and we're making the ZPP behaviour match.
Off the top of my head, I don't know what other inconsistencies remain,
but my point was that in every case so far, internal functions have been
adapted to match userland, not vice versa.
So I'll spend 1 more... I think it's fair to say that developers using
`strict_types=1` are more likely to be using static analysis; and if
`strict_types=1` is going to eventually disappear, those developers won't lose
any functionality with the stricter checking being done by static analysis,
which can check all possible variable types (more reliable than runtime), and
(with the appropriate level of strictness) static analysis can do things like
rejecting the string '5' being passed to an integer parameter and null being
passed to a non-nullable parameter.
There's an unhelpful implication here, and in your discussion of
testing, that PHP users can be divided into two camps: those who check
program correctness with static analysis tools, unit tests, etc; and
those who don't care about program correctness.
Instead, how about we think about those who are writing new code and
want PHP to tell them early when they do something silly; and those who
are maintaining large code bases and have to deal with compatibility
problems. Neither of these groups is helped enough by static analysers -
as you've rightly pointed out elsewhere, static checks are *not*
reliable in a dynamic language, and are not likely to be built-in any
time soon.
I'm by no means the strongest advocate of strictness in PHP - I think
there is a risk of throwing out good features with the bad. But I would
love to see strict_types=1 become unnecessary - not because "everyone's
running static analysers anyway, so who cares" but because the default
behaviour provides a good balance of safety and usability.
That makes me very hesitant to use the strict_types modes as a crutch
for this compatibility break - it only puts off the question of what we
think the sensible behaviour actually is.
Thank you; and you're right, if you write new code today, you could do that,
but that assumes you don't need to tell the difference between an empty value
vs a missing value
As I've said multiple times now, as soon as you pass it to a function
that doesn't have specific handling for nulls, you lose that distinction
anyway. There is literally zero difference in behaviour between "$foo =
htmlspecialchars($_GET['foo'] ?? null)" and "$foo =
htmlspecialchars($_GET['foo'] ?? '')".
Telling users when they've passed null to a non-nullable parameter is
precisely about *preserving* that distinction: if you want null to mean
something specific, treating it as a string is a bug.
But, updating existing code, while that would make automated updates easier, it's likely
to cause problems, because you're editing the value source, with no idea about checks
later on (like your example which looks for NULL)... and that's why an automated update
of existing code would have more luck updating the sinks rather than the sources (e.g. it
knows which sinks are expecting a string, so it can add in a `strval($var)`, or `(string)
$var`, or `$var ?? ""`).
That's a fair point, although "sinks" are often themselves the next
"source", which is what makes static analysis possible as often as it is.
Despite all of the above, I am honestly torn on this issue. It is a
disruptive change, and I'm not a fan of errors for errors' sake; but I
can see the value in the decision made back in 7.0 to exclude nulls by
default.
Regards,
--
Rowan Tommins
[IMSoP]
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php