On Sat, Jul 27, 2024, at 15:26, Christoph M. Becker wrote:
> On 15.02.2023 at 06:18, Rowan Tommins wrote:
>
> > On 15 February 2023 02:35:42 GMT, Thomas Hruska <thru...@cubiclesoft.com>
> > wrote:
> >
> >> On 2/14/2023 2:02 PM, Rowan Tommins wrote:
> >>
> >> I thought about that but didn't know how well it would be received nor,
> >> perhaps more importantly, the direction it should take (i.e. a formal Zend
> >> type in the engine, extending the existing zend_string type, a class, some
> >> combination, or something else entirely). All of the more advanced
> >> options I came up with would have required some code changes to the PHP
> >> source itself with a new data type being the most involved and probably
> >> the most controversial.
> >
> > My instinct was that it could just be a built-in class, with an internal
> > pointer to a zend_string that's completely invisible to userland. Something
> > like how the SimpleXML and DOM objects just point into a libxml parse
> > result.
> >
> > Then to add to existing functions requires changing an argument type from
> > string to string|Buffer, rather than adding new arguments.
> >
> > No change to the type system needed, internally or externally, just some
> > code to unwrap the pointer. But perhaps I'm being naive and
> > oversimplifying, as I don't have a deep understanding of the engine.
> >
> >> I'm not entirely sure what the next step here should be. Should I go
> >> research the above, or go back and develop/test and then propose something
> >> concrete in an OO direction and gather feedback at that point, or should
> >> we hash it out a bit more here on the list to get a more specific
> >> direction to go in?
> >
> > Well, those were just my thoughts; maybe someone else will come along
> > shortly with a very different take.
>
> I'm very late on this discussion, but I think it is an interesting
> topic, and maybe <https://github.com/cmb69/php-stringbuilder>, which I
> had written long ago just to check some assumptions, can serve as POC.
> It is certainly possible to have such a string buffer class without
> having to patch the engine; it could even be made available as PECL
> extension (first).
>
> Note that this StringBuilder uses `smart_str`s[1] what might be a good
> idea or not. But certainly you could use some other internal handling;
> interoperability with `zend_string`s[2] requires to copy the char arrays
> in most cases anyway, since these have a fixed length, and if these
> copies are reduced to a minimum (i.e. the new class has enough
> flexibility to work without casting to and from string), that should be
> bearable.
>
> Not sure if that would work for the "gd imageexportpixels() and
> imageimportpixels()" RFC[3], but it might be worth investigating.
>
> [1]
> <https://www.phpinternalsbook.com/php7/internal_types/strings/smart_str.html>
> [2]
> <https://www.phpinternalsbook.com/php7/internal_types/strings/zend_strings.html>
> [3] <https://wiki.php.net/rfc/gd_image_export_import_pixels>
>
> Cheers,
> Christoph
>
Huh, I am also very late, and this is somewhat poignant: last weekend I managed to refactor zend_string to contain a char* instead of a char[1], with the char* pointing to the memory just after the pointer. It increased the size of zend_string by a few bytes on a 64-bit machine, but would allow for some nice optimizations, such as zend_strings sharing memory (effectively removing the need for the current interned strings implementation).

I ended up ditching it because it would break literally every extension that does its own allocations instead of calling zend_string_alloc|init(). It was also hard to manage when copying strings, which some core extensions also do by hand instead of calling the core zend_string_* functions. Needless to say, "vanilla php" worked fine and all tests passed.

I did submit a small part of my refactoring here: https://github.com/php/php-src/pull/15054 but even something that simple didn't seem well received. So I won't continue with this approach.

But, fwiw, I wouldn't advise changing zend_strings too much: many extensions appear to do one or more of three things themselves, namely their own allocations, their own copying, and/or their own freeing.

Rob
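
For anyone following along, here is a rough sketch of the two layouts described above. It is not the real zend_string (which also carries a refcounted header and a hash), and every name in it (inline_str, ptr_str, ptr_str_share, naive_clone) is hypothetical. It only illustrates, under those assumptions, why a char* member could let two strings share bytes, and one plausible way that hand-rolled copying in an extension would go wrong after such a change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the current layout: the bytes live inline,
 * in the same allocation, starting at the last member of the header. */
typedef struct {
    unsigned refcount;
    size_t   len;
    char     val[1];  /* really len + 1 bytes */
} inline_str;

/* Hypothetical stand-in for the experiment: the header holds a pointer
 * that normally points just past the header, but may point elsewhere. */
typedef struct {
    unsigned refcount;
    size_t   len;
    char    *val;
} ptr_str;

inline_str *inline_str_init(const char *s, size_t len) {
    inline_str *str = malloc(offsetof(inline_str, val) + len + 1);
    if (!str) return NULL;
    str->refcount = 1;
    str->len = len;
    memcpy(str->val, s, len);
    str->val[len] = '\0';
    return str;
}

ptr_str *ptr_str_init(const char *s, size_t len) {
    ptr_str *str = malloc(sizeof(ptr_str) + len + 1);
    if (!str) return NULL;
    str->refcount = 1;
    str->len = len;
    str->val = (char *)(str + 1);  /* memory just after the header */
    memcpy(str->val, s, len);
    str->val[len] = '\0';
    return str;
}

/* Header-only string that borrows a slice of owner's bytes: no copy.
 * The slice is not NUL-terminated at len, and the owner must outlive it
 * (modelled here by bumping the owner's refcount). */
ptr_str *ptr_str_share(ptr_str *owner, size_t offset, size_t len) {
    if (offset > owner->len || len > owner->len - offset) return NULL;
    ptr_str *str = malloc(sizeof(ptr_str));
    if (!str) return NULL;
    owner->refcount++;
    str->refcount = 1;
    str->len = len;
    str->val = owner->val + offset;
    return str;
}

/* What a hand-rolled copy might look like: memcpy of header plus bytes.
 * Harmless with the inline layout, but with ptr_str the copied val
 * still points into the source allocation. */
ptr_str *naive_clone(const ptr_str *src) {
    size_t size = sizeof(ptr_str) + src->len + 1;
    ptr_str *dst = malloc(size);
    if (!dst) return NULL;
    memcpy(dst, src, size);
    return dst;
}
```

With the pointer layout, naive_clone() returns a string whose val aliases the source, so freeing the source leaves the clone dangling. That is the kind of breakage that only shows up in code bypassing the zend_string_* helpers.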