> On 30 Apr 2022, at 18:05, Rowan Tommins <rowan.coll...@gmail.com> wrote:
> 
> On 27/04/2022 16:51, Craig Francis wrote:
>> Forgive this primitive example, but this shows `$name` being used in three 
>> different ways, where an automated tool cannot simply change line 1 so it 
>> doesn't return a NULL, because it would break the Register link.
>> 
>> ```
>> $name = $request->get('name');
>> 
>> if (trim($name) === '') { // Contains non-whitespace characters; so not "", 
>> " ", NULL, etc
>>   $where_sql[] = 'name LIKE ?';
>>   $where_val[] = $name;
>> }
>> 
>> echo '
>>   <form action="./" method="get">
>>     <label>
>>       Search
>>       <input type="search" name="name" value="' . htmlspecialchars($name) . 
>> '">
>>     </label>
>>     <input type="submit" value="Go">
>>   </form>';
>> 
>> if ($name !== NULL) {
>>   $register_url = '/admin/accounts/add/?name=' . urlencode($name);
>>   echo '
>>     <p><a href="' . htmlspecialchars($register_url) . '">Add 
>> Account</a></p>';
>> }
>> ```
>> 
>> In this example, why does `trim()` and `htmlspecialchars()` justify a fatal 
>> Type Error?
> 
> 
> Honestly, I fail to see how the inconsistent null handling in this example is 
> anything other than a bug:


My example was only trying to show how an automated tool cannot simply update 
the `$name` source (line 1), because it will not be able to easily tell what 
happens later (especially if the code is split into multiple files)... whereas 
an automated tool will find it much easier to do this at the sinks:

```
- if (trim($name) === '') {
+ if (trim((string) $name) === '') {
```

Which is how the default NULL from `getConfigData()` was handled with 
PHPCompatibility:

https://github.com/PHPCompatibility/PHPCompatibility/commit/ab7b0b82d95c82e62f9cffb0f1960f3ccbf0805e
 
<https://github.com/PHPCompatibility/PHPCompatibility/commit/ab7b0b82d95c82e62f9cffb0f1960f3ccbf0805e>

Or maybe:

```
- [...] value="' . htmlspecialchars($name) . '">
+ [...] value="' . htmlspecialchars($name ?? '') . '">
```

Which is how "Vaimo Composer Patches" might "fix" the NULL from 
`extractSingleValue()`:

https://github.com/vaimo/composer-patches/pull/96/files 
<https://github.com/vaimo/composer-patches/pull/96/files>

https://github.com/vaimo/composer-patches/blob/cbceb44dfad9e37f18e319125b036a10496ea094/src/Patch/SourceLoaders/PatchesSearch.php#L234
 
<https://github.com/vaimo/composer-patches/blob/cbceb44dfad9e37f18e319125b036a10496ea094/src/Patch/SourceLoaders/PatchesSearch.php#L234>

This is how Laravel Blade handles NULL:

https://github.com/laravel/framework/blob/ab1506091b9f166b312b3990d07b2e21d971f2e6/src/Illuminate/Support/helpers.php#L119
 
<https://github.com/laravel/framework/blob/ab1506091b9f166b312b3990d07b2e21d971f2e6/src/Illuminate/Support/helpers.php#L119>

But, on the basis that PHP considers it a problem if NULL is passed to these 
functions, and should not be allowed (blocking will make better code, 
somehow)... maybe someone should ask for this PR to be reverted? It's only one 
library, so they should be able to easily fix all of the code affected by this, 
right? :-)

https://github.com/laravel/framework/pull/36262 
<https://github.com/laravel/framework/pull/36262>

Interestingly Symphony Twig (which uses NULL for undefined `$context` values), 
preserves NULL when passed to `twig_escape_filter()`... but that's typically 
provided directly to `echo`, where NULL is accepted and coerced to an empty 
string :-)

```
echo twig_escape_filter($this->env, ($context["v"] ?? null), "html", null, 
true);
```

https://github.com/twigphp/Twig/blob/b4d6723715da57667cca851051eba3786714290d/src/Extension/EscaperExtension.php#L195
 
<https://github.com/twigphp/Twig/blob/b4d6723715da57667cca851051eba3786714290d/src/Extension/EscaperExtension.php#L195>

On the basis that PHP considers it a problem if NULL is passed to these 
functions, maybe `echo(string ...$expressions)` and `print(string $expression)` 
should also deprecate (and eventually fatal error) when they receive NULL? That 
would be fun :-)

None of these library changes (to support 8.1) are making their code easier to 
read, nor are they improving the code quality.

Anyway, I'm going on a pointless tangent... the important bit is that many 
frameworks already (and will probably continue to) return NULL by default, 
doing so has been fine (and can be useful in some cases)... and now that NULL 
is going to be rejected, I can only see that creating an upgrade problem.



> * If strings containing only spaces are considered empty, it's probably a 
> mistake that they're not trimmed everywhere. But of course adding $name = 
> trim($name) would remove all distinction between null and ''.


Bit of a tangent, but someone can use `trim()` to check if a value is worth 
saving (or using the SELECT query), but will still want to preserve exactly 
what the user wrote (e.g. taking your name as an example, searching for "Tom" 
vs " Tom" will return your record for both, but not my step-brother Tom).

Either way, why should `trim()` trigger a fatal Type Error if it receives NULL? 
Does that actually represent problem, considering it's fine receiving an 
integer?



> * If null is equivalent to an empty string when deciding whether to run the 
> SQL, why is it not also equivalent to an empty string when rendering the 
> register link? The current logic means that clicking "Go" causes a register 
> link to appear, with an empty name on the query string, which doesn't seem 
> like useful functionality.


While I wasn't intending it to be a real example... maybe imagine the original 
code did not run the DB query when the field is left blank (or only contains 
whitespace characters), probably for performance reasons... and later, after 
someone complains about duplicate accounts, they put in a small change to 
ensure the admin did at least one search, but for anti-frustration reasons, 
they allowed the admin to just press [enter]?



> * On the other hand, if NULL is considered a different state, why is there no 
> behaviour distinguishing it around the SQL logic? It seems likely that an 
> else clause would be added there, perhaps "else { echo 'You must enter a 
> name.'; }" In the current code, that would appear for both null and empty 
> strings, which is probably a mistake.
> 
> I think that actually demonstrates quite nicely that most code would benefit 
> from treating "string" and "?string" as strictly different types, and either 
> defaulting to an empty string explicitly and early, or considering null at 
> every step. Simultaneously relying on null values being preserved, and them 
> being silently coerced, leads to fragile code where it's not clear where 
> nulls are deliberately handled as empty strings, and where they've simply 
> been forgotten about.


It wasn't supposed to demonstrate anything other than the problem an automated 
tool would have.

But, on the basis that many developers don't seem to be "considering null at 
every step" (going on the roughly 15% of scripts using `strict_types=1`)... 
should frameworks default to returning an empty string instead? If so, how much 
existing code will break if that was to happen? Or should all of these 
developers specify an empty string as their default every time they ask for a 
user-value? In either case, it's a lot of work to fix the things that break, 
and I still cannot see the value in doing so.



>>> Telling users when they've passed null to a non-nullable parameter is 
>>> precisely about *preserving* that distinction: if you want null to mean 
>>> something specific, treating it as a string is a bug.
>> 
>> I don't think that represents a bug, we are talking about a system that 
>> takes user input (so often nullable), supports coercion with other simple 
>> types, and supports NULL coercion in other contexts (e.g. string concat).
> 
> 
> Read the sentence you replied to again: IF you want to treat null as distinct 
> from '', THEN failing to do so is a bug. If, on the other hand, you just want 
> to take nullable user input and handle it CONSISTENTLY as a string, then it's 
> fine to explicitly default to '' AT SOURCE.


But we're talking about existing code, and exiting frameworks, who use NULL by 
default; and it's been coerced perfectly fine (until 8.1, or for user defined 
functions that have specified a non-nullable type).

Maybe the frameworks should have taken WordPress as their example, and used an 
empty string by default?

And we're back to the question, what is actually wrong with passing NULL to the 
mentioned functions? I can only see the argument of possibly preserving NULL 
(with different backwards compatibility issues, and I suspect better handled 
with your "?|>" suggestion); or those developers who prefer a very strict 
environment, with no type coercion, where they can treat NULL as an invalid 
value (but probably should be using static analysis to reliably check their 
code never passes NULL to these functions or performs any coercion).



>>> Despite all of the above, I am honestly torn on this issue. It is a 
>>> disruptive change, and I'm not a fan of errors for errors' sake; but I can 
>>> see the value in the decision made back in 7.0 to exclude nulls by default.
>> 
>> Thanks Rowan, I'll just add that I think the fatal Type Error for NULL will 
>> be much more disruptive, and I would rather relax that requirement for user 
>> defined functions, so NULL coercion works in all contexts.
> 
> 
> I think the main difference between our positions is that I believe that if 
> PHP's type system was designed from scratch today, null would not be silently 
> coerced in these situations. So while I agree that the change will be 
> disruptive, I disagree with your position that it brings no benefit.


But what is that benefit? I'm sorry, but I really don't see it.

I'm going on the basis that you're ok with numbers in strings being coerced to 
integers/floats (which I also see as being useful, because you're right, most 
inputs are strings)... but you're not ok with NULL being coerced (which is also 
common, because values aren't guaranteed to be provided by the user, and NULL 
is typically the default).



> On 27/04/2022 18:34, Craig Francis wrote:
>> But I'm wondering, is it only one function? and assuming it's a problem, 
>> could we use `Z_PARAM_LONG_OR_NULL()` and specifically throw an exception 
>> when either parameter is NULL, like the `max < min` check? On the basis that 
>> I'd rather have one extra check for this function, and keep NULL coercion 
>> working everywhere else (i.e. where it's fine).
> 
> 
> Well, since it's one of three examples in Guilliam's e-mail, the answer to 
> the first question seems rather trivially "no", unless I'm missing something?


Sorry, I can only see two examples... `get_class()`, which is more of an 
example of UNKNOWN being used (and NULL has not been "allowed as of PHP 
7.2")... and `mt_rand()`, which I mentioned in the RFC, because you provided a 
good example of NULL being a potential problem, not that I can find anyone 
doing this publicly:

https://grep.app/search?q=mt_rand%28NULL&filter[lang][0]=PHP 
<https://grep.app/search?q=mt_rand%28NULL&filter%5Blang%5D%5B0%5D=PHP>

https://grep.app/search?q=mt_rand%5Cs%2A%5C%28%5Cs%2ANULL&regexp=true&filter[lang][0]=PHP
 
<https://grep.app/search?q=mt_rand%5Cs%2A%5C%28%5Cs%2ANULL&regexp=true&filter%5Blang%5D%5B0%5D=PHP>

https://www.google.com/search?q=%22mt_rand+NULL+NULL%22 
<https://www.google.com/search?q=%22mt_rand+NULL+NULL%22>



> As for the second question, certainly we could add specific prohibitions to 
> null on a case by case basis, but that's basically equivalent to your 
> previous suggestion of explicitly allowing null on a case by case basis, and 
> doesn't really answer the question of what the default behaviour should be - 
> especially bearing in mind that any default should apply to both built-in and 
> user-defined functions.



I don't want oddities either, I was just exploring the possibility of a 
developer accidentally writing `mt_rand(NULL, NULL)`, in the current 
environment (where the NULL's are coerced to the integer 0); and if it's a 
problem worth protecting against, how we could handle it.

Craig

Reply via email to