Hi

On 10/04/2023 22:11, Tim Düsterhus wrote:
> Hi
> 
> On 4/10/23 21:50, Niels Dossche wrote:
>>> The suggested optimization of "the input is overwritten with the output" 
>>> would then also allow to avoid introducing reference parameters just for 
>>> optimization purposes. The sort() family comes to my mind and also the 
>>> shuffle() function. Randomizer::shuffleArray() already returns a copy and 
>>> thus would benefit from the proposed optimization for $a = 
>>> $r->shuffleArray($a).
>>>
>>
>> I did extend my optimization since the first time I posted it here.
>> It can handle two cases:
>> - $x = array_something($x, ...) like I previously showed with array_merge
>> - $x = array_something(temporary_array_with_refcount_1,...) which is new
>>
>> There is one caveat with the first optimization: it is only safe if we know 
>> for sure no exception can happen during array modification.
>> Because, if an exception happens during modification, then the changed array 
>> is already visible to the exception handler, but this isn't allowed because 
>> the assignment didn't happen yet.
>> This "exception problem" does not happen for the second optimization though.
> 
> I see. That makes stuff certainly more complicated, because an exception can 
> also arise in an error handler (i.e. for any warnings and notices).
> 

Right. Thanks for pointing this out! I completely forgot about this.
That's unfortunate... So far only the array_unique optimization is affected 
by this. I have now added a check for whether a user error handler is 
installed.
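
To make the problem concrete, here is a minimal sketch of the situation, assuming an error handler that promotes every warning to an ErrorException (the handler and the sample data are made up for illustration, they are not taken from the PR). It only shows the behaviour that any in-place variant would have to preserve:

```php
<?php
declare(strict_types=1);

// Illustrative handler: promote every warning/notice to an exception.
set_error_handler(static function (int $severity, string $message, string $file, int $line): bool {
    throw new \ErrorException($message, 0, $severity, $file, $line);
});

$original = [[1], [2], [1]];
$x = $original;
$caught = false;

try {
    // With the default flags, nested arrays are converted to strings for
    // comparison. That raises "Array to string conversion", which enters
    // the user handler above while array_unique() is still running.
    $x = array_unique($x);
} catch (\ErrorException $e) {
    // The assignment never completed, so $x must still hold the full input.
    $caught = true;
}

restore_error_handler();
```

If the function wrote into $x directly, the catch block could observe a partially deduplicated array, which is why the optimization has to be skipped whenever a user handler is installed.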

In any case, if anyone is interested, I created a PR against my fork 
(https://github.com/nielsdos/php-src/pull/5) where I'm tinkering with the 
optimization idea.

> It's also not just about visibility to the exception handler, but also 
> "incomplete modifications" in cases like these:
> 
>     try {
>         $foo = something($foo);
>     } catch (\Exception) {}
> 
>     // $foo might or might not be fully modified.
> 
>> So I looked if it was possible to do the optimization for shuffleArray.
>> It is only possible for the second case, because I see some EG(exception) 
>> checks inside php_array_data_shuffle().
>> Unless we can determine upfront which random algorithms have an 
>> exception-free range function.
> 
> For the internal engines this is easy (when ignoring extensions):
> 
> - Mt19937
> - PcgOneseq128XslRr64
> - Xoshiro256StarStar
> 
> ... are all infallible.
> 
> - Secure
> 
> ... is fallible (it doesn't fail in practice on modern OSes, though [1])
> 
> Any userland engine (engine_user.c) can do anything it wants, of course. 
> Unfortunately with Secure being fallible, this optimization is of little use 
> in practice. The same would likely be true for a non-reference sorting 
> function, because of incomparable values and userland comparison handlers 
> that can do all kinds of unsafe stuff.
> 

That's unfortunate. Then it does indeed seem like the optimization will not 
work well for your use case :/
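
For reference, a small sketch of the userland-engine case, assuming PHP 8.2+ for the Random extension. FlakyEngine is a hypothetical class written only for this example; it fails after two calls so that the exception is raised in the middle of the shuffle:

```php
<?php
declare(strict_types=1);

// Hypothetical engine for illustration: succeeds twice, then throws.
final class FlakyEngine implements \Random\Engine
{
    private int $calls = 0;

    public function generate(): string
    {
        if (++$this->calls > 2) {
            throw new \RuntimeException('engine failed mid-shuffle');
        }
        return random_bytes(8);
    }
}

$original = range(1, 10);
$a = $original;
$failed = false;

try {
    // Shuffling ten elements needs more than two random numbers,
    // so the engine is guaranteed to throw before the shuffle finishes.
    $a = (new \Random\Randomizer(new FlakyEngine()))->shuffleArray($a);
} catch (\Throwable $e) {
    $failed = true;
}
```

Because shuffleArray() works on a copy today, $a is still the untouched input after the failure. An in-place shuffle would leave it half shuffled, so the first case of the optimization cannot be applied here.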

> Best regards
> Tim Düsterhus
> 
> [1] For a future major we *might* be able to make a working CSPRNG a hard 
> requirement.

Kind regards
Niels

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php