Re: [PHP-DEV] include cleanup RFC declined

2023-02-15 Thread Derick Rethans
On 15 February 2023 15:18:31 GMT, Max Kellermann  wrote:
>On 2023/02/01 13:13, Max Kellermann  wrote:
>> Voting starts now, please vote on my RFC:
>>  https://wiki.php.net/rfc/include_cleanup
>
>Hi,
>
>Voting on https://wiki.php.net/rfc/include_cleanup ended today at
>15:00 UTC.
>
>The majority of voters (52%) voted "Yes" on the primary vote - "Should
>#include directives be cleaned up?" - but the required supermajority
>for a primary vote was not met.  Therefore, the primary vote is
>declined.

(snip) 

>Interestingly, of all things, the most intrusive vote ("Is it allowed
>to split a large header to reduce dependencies?") got accepted by a
>supermajority.  I'll assemble a PR with just the header splitting
>commits and submit it for merging.

Secondary votes are irrelevant if the primary one doesn't pass. 

cheers
Derick 

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] [RFC] Working With Substrings

2023-02-15 Thread Derick Rethans
On 15 February 2023 13:03:39 GMT, Lydia de Jongh  wrote:
>Hi,
>Very interesting topic! On which I have NO experience 
>
>
>
>On Wed, 15 Feb 2023 at 08:02, Rowan Tommins wrote:
>
>> On 15 February 2023 05:18:50 GMT, Rowan Tommins 
>> wrote:
>> >My instinct was that it could just be a built-in class, with an internal
>> pointer to a zend_string that's completely invisible to userland. Something
>> like how the SimpleXML and DOM objects just point into a libxml parse
>> result.
>>
>> To make this a bit more concrete, what I was picturing was that instead of
>> this example:
>>
>> str_splice($this->pagemap[$pagepos][0], $x2, $size2, $data, $x, $size);
>>
>> You would have something like this:
>>
>> // Wrap an existing zend_string in an object
>> $destBuffer = Buffer::fromString($this->pagemap[$pagepos][0]);
>> // Similar, but also track start and end offsets
>> $sourceBuffer = Buffer::fromSubString($data, $x, $size);
>> // Now do the actual memory copy
>> $destBuffer->splice($x2, $size2, $sourceBuffer);
>>
>>
>>
>
>In some other languages, every variable IS an object by default.
>
>As far as I understand, the code above is meant to be internal.
>But what if every variable were a small object?
>Has this ever been considered? Or would it cost too much performance?
>
>$oString = 'my text';
>
>$oString->toUpper();
>
>echo $oString;  // 'MY TEXT'
>
>
>
>Greetz, Lydia

https://wiki.php.net/rfc/unicode_text_processing

And yes, that won't be as fast as just calling strtoupper. 

cheers
Derick




Re: [PHP-DEV] [RFC] Working With Substrings

2023-02-15 Thread Larry Garfield
On Wed, Feb 15, 2023, at 1:35 PM, Thomas Hruska wrote:
> On 2/15/2023 6:03 AM, Lydia de Jongh wrote:
>> Hi,
>> Very interesting topic! On which I have NO experience 
>> 
>> 
>> In some other languages, every variable IS an object by default.
>> 
>> As far as I understand, the code above is meant to be internal.
>> But what if every variable were a small object?
>> Has this ever been considered? Or would it cost too much performance?
>> 
>> $oString = 'my text';
>> 
>> $oString->toUpper();
>> 
>> echo $oString;  // 'MY TEXT'
>
> The above represents a significant amount of scope creep but it's 
> certainly interesting.  So let's explore it a bit and gauge the response.
>
> The above code will currently throw an error.  Significant global 
> adoption of such a change will take a fairly long time - probably a 
> decade, maybe longer.
>
> AFAIK, there is nothing technically preventing the core Zend engine from 
> accepting a -> token after a string variable and calling a function that 
> performs an inline modification of the string.
>
> As a brief test, I just ran the example code through PHP and got:  "PHP 
> Fatal error:  Uncaught Error: Call to a member function toUpper() on 
> string in test.php:4"  The error message shows that Zend engine clearly 
> already recognizes toUpper() as an attempted function/method call on a 
> string...it just doesn't know what to do with it.  So the logic for 
> supporting -> method calls on strings appears, at least from my very 
> brief test, to already be mostly in place.  Nice!

*snip*

What you're describing here is "scalar methods", which has been discussed on 
and off for many years.  The idea has its proponents, but also its detractors.  
(I'm in the detractor camp, personally, as I think there are better, more 
flexible options.)

I would strongly recommend not allowing "faster string manipulation" to scope 
creep into scalar methods, as that will almost guarantee that it never comes to 
fruition. :-)  IF scalar methods were to happen, they should happen on their 
own.

--Larry Garfield




Re: [PHP-DEV] [RFC] Working With Substrings

2023-02-15 Thread Thomas Hruska

On 2/15/2023 6:03 AM, Lydia de Jongh wrote:

> Hi,
> Very interesting topic! On which I have NO experience
>
>
> In some other languages, every variable IS an object by default.
>
> As far as I understand, the code above is meant to be internal.
> But what if every variable were a small object?
> Has this ever been considered? Or would it cost too much performance?
>
> $oString = 'my text';
>
> $oString->toUpper();
>
> echo $oString;  // 'MY TEXT'


The above represents a significant amount of scope creep but it's 
certainly interesting.  So let's explore it a bit and gauge the response.


The above code will currently throw an error.  Significant global 
adoption of such a change will take a fairly long time - probably a 
decade, maybe longer.


AFAIK, there is nothing technically preventing the core Zend engine from 
accepting a -> token after a string variable and calling a function that 
performs an inline modification of the string.


As a brief test, I just ran the example code through PHP and got:  "PHP 
Fatal error:  Uncaught Error: Call to a member function toUpper() on 
string in test.php:4"  The error message shows that Zend engine clearly 
already recognizes toUpper() as an attempted function/method call on a 
string...it just doesn't know what to do with it.  So the logic for 
supporting -> method calls on strings appears, at least from my very 
brief test, to already be mostly in place.  Nice!
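
For anyone who wants to reproduce that test, here is a minimal sketch.
The variable name $message is only illustrative, and the exact wording
of the error may differ between PHP versions:

```php
<?php
$oString = 'my text';
$message = null;

try {
    // Strings have no methods, so the engine throws an Error here.
    $oString->toUpper();
} catch (Error $e) {
    $message = $e->getMessage();
}

echo $message, "\n";
```

The Error is thrown before anything is modified, so $oString still
holds 'my text' afterwards.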


Supporting this would likely result in two distinct internal functions 
that would have to be maintained.  One inline string-object method 
variant that can avoid copy-on-write (e.g. $var->toUpper()) and one that 
only does copy-on-write (e.g. strtoupper()).  Repeat that for all of the 
existing string functions.  Alternatively, the main function body for 
each function could move into its own function that has a parameter for 
distinguishing the difference between "function (copy) vs. method 
(possibly inline)" calls, which would create some additional overhead 
for the existing ext/standard/string functions.  The average performance 
loss for regular function calls would need to be benchmarked.  Nobody 
likes seeing performance losses, even if they amount to less than a 1% 
reduction.  C function calls are way faster than PHP userland calls, but they 
still have some overhead.  This is just a thought exploration of how it 
could be implemented.
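
As a rough userland analogy of the second option (one shared body with
a flag), a sketch could look like the following.  The function name
ascii_upper_shared() and its $inline parameter are hypothetical, it
only handles ASCII, and the real work would happen in C inside the
engine rather than in PHP:

```php
<?php
// Hypothetical helper: one body serves both call styles.
function ascii_upper_shared(string &$target, bool $inline): string
{
    if ($inline) {
        // Method-style call: write into the caller's variable.
        $work = &$target;
    } else {
        // Function-style call: work on a separate value.
        $work = $target;
    }

    $length = strlen($work);
    for ($i = 0; $i < $length; $i++) {
        $byte = ord($work[$i]);
        if ($byte >= 0x61 && $byte <= 0x7A) {   // 'a' .. 'z'
            $work[$i] = chr($byte - 0x20);
        }
    }

    return $work;
}

$text = 'my text';
$copy = ascii_upper_shared($text, false);  // like strtoupper($text)
$untouched = $text;                        // $text has not changed yet
ascii_upper_shared($text, true);           // like $text->toUpper()
```

The extra branch on $inline is the kind of overhead that would need to
be benchmarked for the existing function-style calls.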


With this approach, a $var->repeat("\x00", 4096, 50) could work to start 
at position 50 and write 4,096 zero bytes.  But that again adds a 
parameter for an offset.  But maybe $var[50...4096 + 50]->repeat("\x00", 
4096) could solve that?  That's a bit awkward to look at, requires 
adding range support to strings (and maybe arrays too because you know 
someone will want that as well), and probably breaks a lot of things.
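
For comparison, the closest thing available today, as far as I know,
is combining substr_replace() with str_repeat().  The sketch below
(the 8,192 byte buffer and the 'A' filler are arbitrary choices)
produces the same bytes but builds temporary strings instead of
writing in place:

```php
<?php
$var = str_repeat('A', 8192);

// Overwrite 4,096 bytes starting at offset 50 with zero bytes.
// str_repeat() allocates the zero block and substr_replace() returns
// a new string, so nothing here is modified inline.
$var = substr_replace($var, str_repeat("\x00", 4096), 50, 4096);
```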


However, I'm not sure this idea can be used with virtual buffers that 
expressly set their size.  zend_string (how strings are stored) simply 
doesn't have support for it.  There's a length member but no size 
member.  Internally, the zend_string implementation assumes length + 1 = 
size.



If you got this far and know how PHP, C, and CPU hardware work, you can 
skip ahead to the last two paragraphs.  The next few paragraphs delve 
into some details to try to explain to Lydia (and others who are 
following along) what's going on under the hood with why I focused on 
substrings.  Apologies in advance for my rambling.



Avoiding copy-on-write requires the internal reference count total 
(refcount) to effectively be 1.  Reference counting helps reduce the 
number of times a copy is made.  Fewer copies generally result in 
faster performance.  A refcount of 1 does happen more frequently when 
inside a loop.  In real world code, depending on what is being done, the 
first loop iteration might have many references to a string while the 
second loop iteration that is operating on the same data might have a 
refcount of just one.  This situation happens frequently enough to 
consider inline options.
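
The effect can be observed from userland with memory_get_usage().  A
minimal sketch, assuming a 1 MiB test string; the exact byte counts
depend on the PHP version and allocator, so only the rough sizes
matter:

```php
<?php
$size = 1024 * 1024;

// Case 1: two variables share one string (refcount 2).
$original = str_repeat('x', $size);
$before = memory_get_usage();
$alias = $original;                 // shares the buffer, no copy yet
$afterAssign = memory_get_usage();
$alias[0] = 'y';                    // the write forces a private copy
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $before;
$sharedWriteCost = $afterWrite - $afterAssign;

// Case 2: a single owner (refcount 1) can be modified in place.
$solo = str_repeat('x', $size);
$before = memory_get_usage();
$solo[0] = 'y';
$soloWriteCost = memory_get_usage() - $before;

printf(
    "assign: %d, shared write: %d, solo write: %d\n",
    $assignCost,
    $sharedWriteCost,
    $soloWriteCost
);
```

The shared write should cost about as much memory as the string
itself, while the assignment and the single-owner write should cost
close to nothing.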


Memory allocation is one of the slower operations in computer programs. 
Ideally, a program makes as few allocation requests to the system as 
possible.  PHP avoids making system calls to allocate memory by pooling 
reclaimed memory into multiple memory pools for reuse.  Copying strings 
from one buffer to another buffer is also avoided by leveraging 
reference counting.  However, this creates the scenario where every 
modified string has its buffer copied from one buffer to the next. 
Let's take this fairly common but simple code to see what happens in 
Zend engine:


$pos = strrpos($str, "/");
$str = substr($str, 0, $pos + 1);

The above substr() results in one "logical" memory allocation and one 
logical free operation (whether it actually makes system calls to 
allocate/free memory is way beyond the scope of this paragraph) and one 
memory copy operation.  We say we want the substring of a certain size, 
which allocates space to create a temporary copy that can hold 

[PHP-DEV] include cleanup RFC declined

2023-02-15 Thread Max Kellermann
On 2023/02/01 13:13, Max Kellermann  wrote:
> Voting starts now, please vote on my RFC:
>  https://wiki.php.net/rfc/include_cleanup

Hi,

Voting on https://wiki.php.net/rfc/include_cleanup ended today at
15:00 UTC.

The majority of voters (52%) voted "Yes" on the primary vote - "Should
#include directives be cleaned up?" - but the required supermajority
for a primary vote was not met.  Therefore, the primary vote is
declined.

On the secondary vote "Is it allowed to document an #include line with
a code comment?", 90% of all voters do not want to allow code comments
on #include lines.  To fix the PHP code base according to this
decision, please consider merging
https://github.com/php/php-src/pull/10472

The secondary vote "Is it allowed to forward-declare
structs/unions/typedefs?" was clearly rejected as well; 87.5% of all
voters thought forward declarations should not be allowed.  There are
numerous unnecessary forward declarations; several of these are
removed by https://github.com/php/php-src/pull/10494 - please consider
merging this PR for compliance with this decision.

Interestingly, of all things, the most intrusive vote ("Is it allowed
to split a large header to reduce dependencies?") got accepted by a
supermajority.  I'll assemble a PR with just the header splitting
commits and submit it for merging.

From my minimal #include cleanup PR
(https://github.com/php/php-src/pull/10410), I have removed all
include comments.  The RFC failed to meet the supermajority, but I'm
not sure if that means that #include cleanups are now (or still?)
forbidden.  Having a majority, but no supermajority sounds like it's
inconclusive, but I don't know what that means and how to proceed.

Max




Re: [PHP-DEV] [RFC] Working With Substrings

2023-02-15 Thread Lydia de Jongh
Hi,
Very interesting topic! On which I have NO experience 



On Wed, 15 Feb 2023 at 08:02, Rowan Tommins wrote:

> On 15 February 2023 05:18:50 GMT, Rowan Tommins 
> wrote:
> >My instinct was that it could just be a built-in class, with an internal
> pointer to a zend_string that's completely invisible to userland. Something
> like how the SimpleXML and DOM objects just point into a libxml parse
> result.
>
> To make this a bit more concrete, what I was picturing was that instead of
> this example:
>
> str_splice($this->pagemap[$pagepos][0], $x2, $size2, $data, $x, $size);
>
> You would have something like this:
>
> // Wrap an existing zend_string in an object
> $destBuffer = Buffer::fromString($this->pagemap[$pagepos][0]);
> // Similar, but also track start and end offsets
> $sourceBuffer = Buffer::fromSubString($data, $x, $size);
> // Now do the actual memory copy
> $destBuffer->splice($x2, $size2, $sourceBuffer);
>
>
>

In some other languages, every variable IS an object by default.

As far as I understand, the code above is meant to be internal.
But what if every variable were a small object?
Has this ever been considered? Or would it cost too much performance?

$oString = 'my text';

$oString->toUpper();

echo $oString;  // 'MY TEXT'



Greetz, Lydia