Hi

On 10/04/2023 22:11, Tim Düsterhus wrote:
> Hi
>
> On 4/10/23 21:50, Niels Dossche wrote:
>>> The suggested optimization of "the input is overwritten with the output"
>>> would then also allow to avoid introducing reference parameters just for
>>> optimization purposes. The sort() family comes to my mind and also the
>>> shuffle() function. Randomizer::shuffleArray() already returns a copy and
>>> thus would benefit from the proposed optimization for $a =
>>> $r->shuffleArray($a).
>>>
>>
>> I did extend my optimization since the first time I posted it here.
>> It can handle two cases:
>> - $x = array_something($x, ...) like I previously showed with array_merge
>> - $x = array_something(temporary_array_with_refcount_1,...) which is new
>>
>> There is one caveat with the first optimization: it is only safe if we know
>> for sure no exception can happen during array modification.
>> Because, if an exception happens during modification, then the changed array
>> is already visible to the exception handler, but this isn't allowed because
>> the assignment didn't happen yet.
>> This "exception problem" does not happen for the second optimization though.
>
> I see. That makes stuff certainly more complicated, because an exception can
> also arise in an error handler (i.e. for any warnings and notices).
>
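To make the error-handler case concrete, here is a minimal sketch. It is not taken from the PR. It assumes current, unoptimized PHP 8 behaviour and relies on array_unique() raising an "Array to string conversion" warning for nested arrays. The handler and the variable names are purely illustrative.

```php
<?php
// Illustrative handler: turn every warning/notice into an exception.
set_error_handler(function (int $errno, string $errstr): bool {
    throw new ErrorException($errstr, 0, $errno);
});

$foo = [[1], 'a', 'a', [1]];
$before = $foo;
$caught = false;

try {
    // The nested arrays trigger a warning inside array_unique(); the
    // handler throws, so the assignment to $foo never takes place.
    $foo = array_unique($foo);
} catch (ErrorException $e) {
    $caught = true;
}

restore_error_handler();

// Today $foo is guaranteed to be untouched here. An in-place variant
// would have to preserve exactly this guarantee.
var_dump($caught, $foo === $before);
```

An in-place array_unique() that had already removed some elements before the warning fired would leave $foo partially modified at the var_dump() call, which is why a check for an installed user handler is needed.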
Right. Thanks for pointing this out! I completely forgot about this.
That's unfortunate...
So far only the array_unique optimization will have trouble with this.
I have now added a check for whether a user error handler is installed.

In any case, if anyone is interested, I created a PR against my fork
(https://github.com/nielsdos/php-src/pull/5) where I'm tinkering with the
optimization idea.

> It's also not just about visibility to the exception handler, but also
> "incomplete modifications" in cases like these:
>
>     try {
>         $foo = something($foo);
>     } catch (\Exception) {}
>
>     // $foo might or might not be fully modified.
>
>> So I looked if it was possible to do the optimization for shuffleArray.
>> It is only possible for the second case, because I see some EG(exception)
>> checks inside php_array_data_shuffle().
>> Unless we can determine upfront which random algorithms have an
>> exception-free range function.
>
> For the internal engines this is easy (when ignoring extensions):
>
> - Mt19937
> - PcgOneseq128XslRr64
> - Xoshiro256StarStar
>
> ... are all infallible.
>
> - Secure
>
> ... is fallible (it doesn't fail in practice on modern OSes, though [1])
>
> Any userland engine (engine_user.c) can do anything it wants, of course.
> Unfortunately with Secure being fallible, this optimization is of little use
> in practice. The same would likely be true for a non-reference sorting
> function, because of incomparable values and userland comparison handlers
> that can do all kinds of unsafe stuff.
>

That's unfortunate. Then it does indeed seem like the optimization will not
work well for your use case :/

> Best regards
> Tim Düsterhus
>
> [1] For a future major we *might* be able to make a working CSPRNG a hard
> requirement.

Kind regards
Niels
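For reference, a small sketch of the infallible versus fallible engine distinction discussed above, assuming PHP 8.2+ with ext/random. ThrowingEngine is a hypothetical userland engine invented for this example; it is not part of php-src or of the PR.

```php
<?php
use Random\Engine;
use Random\Engine\Mt19937;
use Random\Randomizer;

// Hypothetical userland engine that always fails when asked for bytes.
final class ThrowingEngine implements Engine
{
    public function generate(): string
    {
        throw new RuntimeException('engine failure');
    }
}

$original = [1, 2, 3, 4, 5];

// Infallible internal engine: the assignment always completes.
$a = $original;
$a = (new Randomizer(new Mt19937(1234)))->shuffleArray($a);
$sorted = $a;
sort($sorted); // same elements as before, only the order may differ

// Fallible engine: the exception propagates out of shuffleArray(),
// so $b must still hold the original, unshuffled array afterwards.
$b = $original;
$failed = false;
try {
    $b = (new Randomizer(new ThrowingEngine()))->shuffleArray($b);
} catch (Throwable $e) {
    $failed = true;
}

var_dump($sorted === $original, $failed, $b === $original);
```

With an in-place shuffle, the second case could only stay correct if the engine were known upfront to be infallible, which is the limitation described in the thread.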