Re: [PHP-DEV] include cleanup RFC declined
On 15 February 2023 15:18:31 GMT, Max Kellermann wrote:
>On 2023/02/01 13:13, Max Kellermann wrote:
>> Voting starts now, please vote on my RFC:
>> https://wiki.php.net/rfc/include_cleanup
>
>Hi,
>
>voting of https://wiki.php.net/rfc/include_cleanup has ended today at
>15 UTC.
>
>The majority of voters (52%) voted "Yes" on the primary vote - "Should
>#include directives be cleaned up?" - but the required supermajority
>for a primary vote was not met. Therefore, the primary vote is
>declined.

(snip)

>Interestingly, of all things, the most intrusive vote ("Is it allowed
>to split a large header to reduce dependencies?") got accepted by a
>supermajority. I'll assemble a PR with just the header splitting
>commits and submit it for merging.

Secondary votes are irrelevant if the primary one doesn't pass.

cheers
Derick

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] [RFC] Working With Substrings
On 15 February 2023 13:03:39 GMT, Lydia de Jongh wrote:
>Hi,
>Very interesting topic! On which I have NO experience
>
>On Wed, 15 Feb 2023 at 08:02, Rowan Tommins wrote:
>
>> On 15 February 2023 05:18:50 GMT, Rowan Tommins wrote:
>> >My instinct was that it could just be a built-in class, with an internal
>> >pointer to a zend_string that's completely invisible to userland. Something
>> >like how the SimpleXML and DOM objects just point into a libxml parse
>> >result.
>>
>> To make this a bit more concrete, what I was picturing was that instead of
>> this example:
>>
>> str_splice($this->pagemap[$pagepos][0], $x2, $size2, $data, $x, $size);
>>
>> You would have something like this:
>>
>> // Wrap an existing zend_string in an object
>> $destBuffer = Buffer::fromString($this->pagemap[$pagepos][0]);
>> // Similar, but also track start and end offsets
>> $sourceBuffer = Buffer::fromSubString($data, $x, $size);
>> // Now do the actual memory copy
>> $destBuffer->splice($x2, $size2, $sourceBuffer);
>
>In some other languages every variable IS an object. by default.
>
>As far as I understand, the code above is meant as internal.
>But what if any variable is a small object.
>Has this been ever considered? Or would it use too much performance?
>
>$oString = 'my text';
>
>$oString->toUpper();
>
>echo $oString; // 'MY TEXT'
>
>Greetz, Lydia

https://wiki.php.net/rfc/unicode_text_processing

And yes, that won't be as fast as just calling strtoupper.

cheers
Derick
Re: [PHP-DEV] [RFC] Working With Substrings
On Wed, Feb 15, 2023, at 1:35 PM, Thomas Hruska wrote:
> On 2/15/2023 6:03 AM, Lydia de Jongh wrote:
>> Hi,
>> Very interesting topic! On which I have NO experience
>>
>> In some other languages every variable IS an object. by default.
>>
>> As far as I understand, the code above is meant as internal.
>> But what if any variable is a small object.
>> Has this been ever considered? Or would it use too much performance?
>>
>> $oString = 'my text';
>>
>> $oString->toUpper();
>>
>> echo $oString; // 'MY TEXT'
>
> The above represents a significant amount of scope creep but it's
> certainly interesting. So let's explore it a bit and gauge the response.
>
> The above code will currently throw an error. Significant global
> adoption of such a change will take a fairly long time - probably a
> decade, maybe longer.
>
> AFAIK, there is nothing technically preventing the core Zend engine from
> accepting a -> token after a string variable and calling a function that
> performs an inline modification of the string.
>
> As a brief test, I just ran the example code through PHP and got: "PHP
> Fatal error: Uncaught Error: Call to a member function toUpper() on
> string in test.php:4" The error message shows that Zend engine clearly
> already recognizes toUpper() as an attempted function/method call on a
> string...it just doesn't know what to do with it. So the logic for
> supporting -> method calls on strings appears, at least from my very
> brief test, to already be mostly in place. Nice!

*snip*

What you're describing here is "scalar methods", which has been discussed on and off for many years. The idea has its proponents, but also its detractors. (I'm in the detractor camp, personally, as I think there are better, more flexible options.)

I would strongly recommend not allowing "faster string manipulation" to scope creep into scalar methods, as that will almost guarantee that it never comes to fruition. :-) IF scalar methods were to happen, they should happen on their own.
--Larry Garfield
Re: [PHP-DEV] [RFC] Working With Substrings
On 2/15/2023 6:03 AM, Lydia de Jongh wrote:
> Hi,
> Very interesting topic! On which I have NO experience
>
> In some other languages every variable IS an object. by default.
>
> As far as I understand, the code above is meant as internal.
> But what if any variable is a small object.
> Has this been ever considered? Or would it use too much performance?
>
> $oString = 'my text';
>
> $oString->toUpper();
>
> echo $oString; // 'MY TEXT'

The above represents a significant amount of scope creep but it's certainly interesting. So let's explore it a bit and gauge the response.

The above code will currently throw an error. Significant global adoption of such a change will take a fairly long time - probably a decade, maybe longer.

AFAIK, there is nothing technically preventing the core Zend engine from accepting a -> token after a string variable and calling a function that performs an inline modification of the string.

As a brief test, I just ran the example code through PHP and got: "PHP Fatal error: Uncaught Error: Call to a member function toUpper() on string in test.php:4" The error message shows that Zend engine clearly already recognizes toUpper() as an attempted function/method call on a string...it just doesn't know what to do with it. So the logic for supporting -> method calls on strings appears, at least from my very brief test, to already be mostly in place. Nice!

Supporting this would likely result in two distinct internal functions that would have to be maintained. One inline string-object method variant that can avoid copy-on-write (e.g. $var->toUpper()) and one that only does copy-on-write (e.g. strtoupper()). Repeat that for all of the existing string functions. Alternatively, the main function body for each function could move into its own function that has a parameter for distinguishing the difference between "function (copy) vs. method (possibly inline)" calls, which would create some additional overhead for the existing ext/standard/string functions.
The average performance loss for regular function calls would need to be benchmarked. Nobody likes seeing performance losses even if they end up being a less than 1% reduction. C function calls are way faster than PHP userland but they still have some overhead. This is just a thought exploration of how it could be implemented.

With this approach, a $var->repeat("\x00", 4096, 50) could work to start at position 50 and write 4,096 zero bytes. But that again adds a parameter for an offset. But maybe $var[50...4096 + 50]->repeat("\x00", 4096) could solve that? That's a bit awkward to look at, requires adding range support to strings (and maybe arrays too because you know someone will want that as well), and probably breaks a lot of things.

However, I'm not sure this idea can be used with virtual buffers that expressly set their size. zend_string (how strings are stored) simply doesn't have support for it. There's a length member but no size member. Internally, the zend_string implementation assumes length + 1 = size.

If you got this far and know how PHP, C, and CPU hardware work, you can skip ahead to the last two paragraphs. The next few paragraphs delve into some details to try to explain to Lydia (and others who are following along) what's going on under the hood and why I focused on substrings. Apologies in advance for my rambling.

Avoiding copy-on-write requires the internal reference count total (refcount) to effectively be 1. Reference counting helps reduce the number of times a copy is made. Fewer copies generally result in faster performance. A refcount of 1 does happen more frequently when inside a loop. In real world code, depending on what is being done, the first loop iteration might have many references to a string while the second loop iteration that is operating on the same data might have a refcount of just one. This situation happens frequently enough to consider inline options.
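For readers following along, the copy-on-write behaviour described above can be observed indirectly from userland. The following is only a minimal sketch, assuming PHP 7 or later with the default Zend memory manager; the function name demo_copy_on_write() is made up for illustration, and memory_get_usage() deltas are a rough signal rather than an exact measurement of what the engine does.

```php
<?php
// Illustration only: observe copy-on-write through memory_get_usage().
// demo_copy_on_write() is a hypothetical helper, not part of PHP.
function demo_copy_on_write(int $size): array
{
    $a = str_repeat('a', $size);      // one heap-allocated string, refcount 1
    $before = memory_get_usage();

    $b = $a;                          // no copy: both variables share one buffer
    $afterAssign = memory_get_usage();

    $b[0] = 'x';                      // write while shared: the buffer is duplicated first
    $afterWrite = memory_get_usage();

    return [
        'assign_delta' => $afterAssign - $before,
        'write_delta'  => $afterWrite - $afterAssign,
        'a_first'      => $a[0],      // the original string is untouched
        'b_first'      => $b[0],      // only the separated copy changed
    ];
}

$r = demo_copy_on_write(1 << 20);     // 1 MiB string
```

The assignment itself should cost essentially nothing, while the first write to the shared string should cost at least the length of the string, because the engine has to separate the two variables before modifying one of them.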
Memory allocation is one of the slower operations in computer programs. Ideally, a program makes as few allocation requests to the system as possible. PHP avoids making system calls to allocate memory by pooling reclaimed memory into multiple memory pools for reuse. Copying strings from one buffer to another buffer is also avoided by leveraging reference counting. However, this creates the scenario where every modified string has its buffer copied from one buffer to the next.

Let's take this fairly common but simple code to see what happens in Zend engine:

$pos = strrpos($str, "/");
$str = substr($str, 0, $pos + 1);

The above substr() results in one "logical" memory allocation and one logical free operation (whether it actually makes system calls to allocate/free memory is way beyond the scope of this paragraph) and one memory copy operation. We say we want the substring of a certain size, which allocates space to create a temporary copy that can hold
[PHP-DEV] include cleanup RFC declined
On 2023/02/01 13:13, Max Kellermann wrote:
> Voting starts now, please vote on my RFC:
> https://wiki.php.net/rfc/include_cleanup

Hi,

voting of https://wiki.php.net/rfc/include_cleanup has ended today at 15 UTC.

The majority of voters (52%) voted "Yes" on the primary vote - "Should #include directives be cleaned up?" - but the required supermajority for a primary vote was not met. Therefore, the primary vote is declined.

On the secondary vote "Is it allowed to document an #include line with a code comment?", 90% of all voters do not want to allow code comments on #include lines. To fix the PHP code base according to this decision, please consider merging https://github.com/php/php-src/pull/10472

The secondary vote "Is it allowed to forward-declare structs/unions/typedefs?" was clearly rejected as well; 87.5% of all voters thought forward declarations should not be allowed. There are numerous unnecessary forward declarations; several of these are removed by https://github.com/php/php-src/pull/10494 - please consider merging this PR for compliance with this decision.

Interestingly, of all things, the most intrusive vote ("Is it allowed to split a large header to reduce dependencies?") got accepted by a supermajority. I'll assemble a PR with just the header splitting commits and submit it for merging.

From my minimal #include cleanup PR (https://github.com/php/php-src/pull/10410), I have removed all include comments.

The RFC failed to meet the supermajority, but I'm not sure if that means that #include cleanups are now (or still?) forbidden. Having a majority, but no supermajority sounds like it's inconclusive, but I don't know what that means and how to proceed.

Max
Re: [PHP-DEV] [RFC] Working With Substrings
Hi,
Very interesting topic! On which I have NO experience

On Wed, 15 Feb 2023 at 08:02, Rowan Tommins wrote:

> On 15 February 2023 05:18:50 GMT, Rowan Tommins wrote:
> >My instinct was that it could just be a built-in class, with an internal
> >pointer to a zend_string that's completely invisible to userland. Something
> >like how the SimpleXML and DOM objects just point into a libxml parse
> >result.
>
> To make this a bit more concrete, what I was picturing was that instead of
> this example:
>
> str_splice($this->pagemap[$pagepos][0], $x2, $size2, $data, $x, $size);
>
> You would have something like this:
>
> // Wrap an existing zend_string in an object
> $destBuffer = Buffer::fromString($this->pagemap[$pagepos][0]);
> // Similar, but also track start and end offsets
> $sourceBuffer = Buffer::fromSubString($data, $x, $size);
> // Now do the actual memory copy
> $destBuffer->splice($x2, $size2, $sourceBuffer);

In some other languages every variable IS an object. by default.

As far as I understand, the code above is meant as internal.
But what if any variable is a small object.
Has this been ever considered? Or would it use too much performance?

$oString = 'my text';

$oString->toUpper();

echo $oString; // 'MY TEXT'

Greetz, Lydia
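As a point of comparison, the behaviour sketched in the snippet above can be approximated today with an explicit wrapper object. This is only an illustration, assuming PHP 8.0 or later; the class name StringObject and its toUpper() method are hypothetical and are not part of PHP or of any RFC mentioned in this thread.

```php
<?php
// Hypothetical userland wrapper; not part of PHP or of any RFC.
final class StringObject
{
    public function __construct(private string $value) {}

    // Replaces the wrapped string with its upper-case form and
    // returns $this so that calls can be chained.
    public function toUpper(): static
    {
        $this->value = strtoupper($this->value);
        return $this;
    }

    // Lets the object be used wherever a string is expected, e.g. echo.
    public function __toString(): string
    {
        return $this->value;
    }
}

$oString = new StringObject('my text');
$oString->toUpper();
echo $oString, PHP_EOL; // MY TEXT
```

The wrapper still calls strtoupper() internally, so it adds a method call and an object on top of the plain function rather than avoiding any copying.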