Hi

On 4/10/23 21:50, Niels Dossche wrote:
>> The suggested optimization of "the input is overwritten with the output" would
>> then also allow to avoid introducing reference parameters just for optimization purposes.
>> The sort() family comes to my mind and also the shuffle() function.
>> Randomizer::shuffleArray() already returns a copy and thus would benefit from the proposed
>> optimization for $a = $r->shuffleArray($a).


> I did extend my optimization since the first time I posted it here.
> It can handle two cases:
> - $x = array_something($x, ...) like I previously showed with array_merge
> - $x = array_something(temporary_array_with_refcount_1,...) which is new
>
> There is one caveat with the first optimization: it is only safe if we know for
> sure no exception can happen during array modification.
> Because, if an exception happens during modification, then the changed array is
> already visible to the exception handler, but this isn't allowed because the
> assignment didn't happen yet.
> This "exception problem" does not happen for the second optimization though.

I see. That certainly makes things more complicated, because an exception can also be thrown from an error handler (i.e. for any warning or notice).
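
To illustrate the error handler case, here is a minimal userland sketch. It makes two assumptions that are not part of the actual engine code: noisyAppend() is a hypothetical stand-in for an internal array function that emits a warning partway through its work, and trigger_error() stands in for the engine raising that warning.

```php
<?php
// Convert every warning/notice into an exception, as many applications do.
set_error_handler(function (int $severity, string $message, string $file, int $line): bool {
    throw new \ErrorException($message, 0, $severity, $file, $line);
});

// Hypothetical stand-in for an internal function that warns mid-way.
function noisyAppend(array $input): array
{
    $input[] = 'first';
    // The error handler throws here, so the rest of the function never runs.
    trigger_error('something looked odd', E_USER_WARNING);
    $input[] = 'second';

    return $input;
}

$threw = false;
try {
    noisyAppend([]);
} catch (\ErrorException $e) {
    // A plain warning ended up aborting the function half-way.
    $threw = true;
}

restore_error_handler();
```

So even a function that never throws by itself cannot be treated as exception-free if it can emit any diagnostic.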

It's not just about visibility to the exception handler, but also about "incomplete modifications" in cases like these:

    try {
        $foo = something($foo);
    } catch (\Exception) {}

    // $foo might or might not be fully modified.
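
For comparison, a small sketch of what userland can rely on today, without the in-place optimization. The something() function here is a hypothetical stand-in that modifies its argument and then fails:

```php
<?php
// Hypothetical stand-in for something(): modifies its array, then fails.
function something(array $input): array
{
    $input[] = 'partially done';
    throw new \RuntimeException('failed half-way');
}

$foo = ['original'];
try {
    $foo = something($foo);
} catch (\Exception) {
}

// The function worked on its own copy and the assignment never happened,
// so the caller's $foo is untouched.
var_dump($foo === ['original']); // bool(true)
```

An in-place optimization would have to preserve exactly this guarantee.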

So I looked if it was possible to do the optimization for shuffleArray.
It is only possible for the second case, because I see some EG(exception) 
checks inside php_array_data_shuffle().
Unless we can determine upfront which random algorithms have an exception-free 
range function.

For the internal engines this is easy (when ignoring extensions):

- Mt19937
- PcgOneseq128XslRr64
- Xoshiro256StarStar

... are all infallible.

- Secure

... is fallible (it doesn't fail in practice on modern OSes, though [1])

Any userland engine (engine_user.c) can do anything it wants, of course. Unfortunately, with Secure (the default engine) being fallible, this optimization is of little use in practice. The same would likely be true for a non-reference sorting function, because of incomparable values and userland comparison handlers that can do all kinds of unsafe stuff.
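
As a rough sketch of the userland engine case (requires PHP 8.2+): FailingEngine is a hypothetical engine made up for this example. It delegates to Mt19937 for its first two calls and then throws. The exact exception type that reaches the caller may differ between versions, so the sketch catches \Throwable.

```php
<?php
// Hypothetical userland engine that fails after two generate() calls.
final class FailingEngine implements \Random\Engine
{
    private \Random\Engine\Mt19937 $inner;
    private int $calls = 0;

    public function __construct()
    {
        $this->inner = new \Random\Engine\Mt19937(1234);
    }

    public function generate(): string
    {
        if (++$this->calls > 2) {
            throw new \RuntimeException('engine gave up');
        }

        return $this->inner->generate();
    }
}

$a = [1, 2, 3, 4, 5];
$r = new \Random\Randomizer(new FailingEngine());

$threw = false;
try {
    // Shuffling five elements needs more than two random numbers,
    // so the engine throws in the middle of the shuffle.
    $a = $r->shuffleArray($a);
} catch (\Throwable $e) {
    $threw = true;
}

// shuffleArray() worked on a copy, so $a still holds the original order.
var_dump($threw, $a === [1, 2, 3, 4, 5]);
```

With an in-place shuffle, $a could be observed half-shuffled at this point, which is the problem described above.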

Best regards
Tim Düsterhus

[1] For a future major we *might* be able to make a working CSPRNG a hard requirement.
