On Sat, Jul 27, 2024, at 15:26, Christoph M. Becker wrote:
> On 15.02.2023 at 06:18, Rowan Tommins wrote:
> 
> > On 15 February 2023 02:35:42 GMT, Thomas Hruska <thru...@cubiclesoft.com> 
> > wrote:
> >
> >> On 2/14/2023 2:02 PM, Rowan Tommins wrote:
> >>
> >> I thought about that but didn't know how well it would be received nor, 
> >> perhaps more importantly, the direction it should take (i.e. a formal Zend 
> >> type in the engine, extending the existing zend_string type, a class, some 
> >> combination, or something else entirely).  All of the more advanced 
> >> options I came up with would have required some code changes to the PHP 
> >> source itself with a new data type being the most involved and probably 
> >> the most controversial.
> >
> > My instinct was that it could just be a built-in class, with an internal 
> > pointer to a zend_string that's completely invisible to userland. Something 
> > like how the SimpleXML and DOM objects just point into a libxml parse 
> > result.
> >
> > Then to add to existing functions requires changing an argument type from 
> > string to string|Buffer, rather than adding new arguments.
> >
> > No change to the type system needed, internally or externally, just some 
> > code to unwrap the pointer. But perhaps I'm being naive and 
> > oversimplifying, as I don't have a deep understanding of the engine.
> >
> >> I'm not entirely sure what the next step here should be.  Should I go 
> >> research the above, or go back and develop/test and then propose something 
> >> concrete in an OO direction and gather feedback at that point, or should 
> >> we hash it out a bit more here on the list to get a more specific 
> >> direction to go in?
> >
> > Well, those were just my thoughts; maybe someone else will come along 
> > shortly with a very different take.
> 
> I'm very late to this discussion, but I think it is an interesting
> topic, and maybe <https://github.com/cmb69/php-stringbuilder>, which I
> had written long ago just to check some assumptions, can serve as a POC.
> It is certainly possible to have such a string buffer class without
> having to patch the engine; it could even be made available as a PECL
> extension (first).
> 
> Note that this StringBuilder uses `smart_str`s[1], which may or may not be
> a good idea.  But certainly you could use some other internal handling;
> interoperability with `zend_string`s[2] requires copying the char arrays
> in most cases anyway, since these have a fixed length, and if these
> copies are reduced to a minimum (i.e. the new class has enough
> flexibility to work without casting to and from string), that should be
> bearable.
> 
> Not sure if that would work for the "gd imageexportpixels() and
> imageimportpixels()" RFC[3], but it might be worth investigating.
> 
> [1]
> <https://www.phpinternalsbook.com/php7/internal_types/strings/smart_str.html>
> [2]
> <https://www.phpinternalsbook.com/php7/internal_types/strings/zend_strings.html>
> [3] <https://wiki.php.net/rfc/gd_image_export_import_pixels>
> 
> Cheers,
> Christoph
> 

Huh, I am also very late, and this is somewhat poignant: last weekend I managed 
to refactor all zend_strings to contain a char* instead of a char[1], with the 
char* pointing to the memory just after the pointer. That increased the size of 
zend_string by a few bytes on a 64-bit machine, but would allow for some nice 
optimizations, such as zend_strings sharing memory (effectively removing the 
need for the current interned strings implementation). I ended up ditching it 
because it would break literally every extension that does its own allocations 
instead of calling zend_string_alloc|init(), and it was also hard to manage 
when copying strings, which some core extensions also do by hand instead of 
calling the core zend_string_* functions. That said, "vanilla php" worked fine 
and all tests passed.
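
To make the copying problem concrete, here is a minimal standalone sketch. It 
is not the actual php-src code or my patch: the struct and function names 
(inline_str, ptr_str, ptr_str_raw_copy, ptr_str_fixup) are hypothetical, and 
the real zend_string also carries a refcount header and a hash, which are 
left out. It only assumes that hand-written copying means a memcpy of the 
whole block, which is one plausible way such code breaks.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the current layout: the bytes live inline,
 * directly after the header, in the same allocation. */
typedef struct {
    size_t len;
    char   val[1];
} inline_str;

/* Hypothetical stand-in for the experimental layout: val is a pointer
 * that normally points just past the struct, but could point elsewhere. */
typedef struct {
    size_t len;
    char  *val;
} ptr_str;

static inline_str *inline_str_new(const char *s, size_t len) {
    inline_str *str = malloc(offsetof(inline_str, val) + len + 1);
    if (!str) return NULL;
    str->len = len;
    memcpy(str->val, s, len);
    str->val[len] = '\0';
    return str;
}

static ptr_str *ptr_str_new(const char *s, size_t len) {
    ptr_str *str = malloc(sizeof(ptr_str) + len + 1);
    if (!str) return NULL;
    str->len = len;
    str->val = (char *)(str + 1);   /* memory just after the struct */
    memcpy(str->val, s, len);
    str->val[len] = '\0';
    return str;
}

/* What hand-written copying might look like (an assumption): duplicate
 * the whole block. With inline_str this is a complete copy. With ptr_str
 * the copied val still points into the source block. */
static ptr_str *ptr_str_raw_copy(const ptr_str *src) {
    size_t size = sizeof(ptr_str) + src->len + 1;
    ptr_str *dst = malloc(size);
    if (!dst) return NULL;
    memcpy(dst, src, size);
    return dst;
}

/* The extra step every such copy would need under the pointer layout. */
static void ptr_str_fixup(ptr_str *str) {
    str->val = (char *)(str + 1);
}
```

After ptr_str_raw_copy() the copy's val aliases the original's bytes, so 
freeing the original leaves the copy dangling unless something like 
ptr_str_fixup() runs. Code that only ever goes through the allocation and 
copy helpers would never notice the layout change.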

I did submit a small part of my refactoring here: 
https://github.com/php/php-src/pull/15054 but even something that simple didn't 
seem well received. So, I won't continue this approach.

But, fwiw, I wouldn't advise changing zend_strings too much. Many extensions 
appear to do at least one of three things themselves: their own allocations, 
their own copying, or their own freeing.

— Rob
