Hi Derick, Thomas,

On Thu, 16 Feb 2023 at 08:57, Derick Rethans <der...@php.net> wrote:

> > https://wiki.php.net/rfc/unicode_text_processing
>
> And yes, that won't be as fast as just calling strtoupper.
>
> cheers
> Derick

Looks great!!!
Complex string manipulation inside an object will be faster than copying variables around in memory all the time, as Thomas kindly explained in his post, if I understand correctly.
And it would make PHP even more mature by gaining more OOP.

On Wed, 15 Feb 2023 at 20:35, Thomas Hruska <thru...@cubiclesoft.com> wrote:

> <......>
> Doing that operation one time is fast enough and not really a problem.
> Doing it 1,000,000 times in a loop is where we end up constantly copying
> memory around when we could potentially work on the same memory buffer
> the entire time. We still might end up using the same memory buffers
> over and over due to recycling them through the PHP memory pool, which
> means the buffers might get to sit in the L1 or L2 cache in the CPU, but
> it does leave some performance on the table because copying a buffer or
> portions of it repeatedly can be an unnecessary operation. Buffers that
> are larger than the CPU's cache line sizes are going to suffer the most
> because there will be constant requests to main memory for the
> information that the CPU needs to modify and will constantly flush the
> cache lines and stall out while waiting for more data to arrive. That's
> not exactly optimal/ideal. Modifying the same buffer inline will
> more likely stay in the L1 and L2 cache lines and therefore be much
> closer to the CPU core, resulting in notably faster performance.
> Pointers in C are much faster than copying memory. The problem is
> exposing pointers to userland, especially in Internet-facing software.
> Pointers are notoriously unsafe - just look at the zillion buffer
> overflow vulnerabilities (CVEs) that are reported annually across all
> software products.
> Copy-on-write, by comparison, is a much safer
> operation at the cost of performance. However, pointers let us just
> point at a substring or general chunk of memory instead of copying it,
> which significantly reduces the overhead since pointers are simple
> integer values that contain a memory address. And those values are
> small enough to sit in CPU registers, which are blazing fast. CPUs only
> have a handful of registers though because each register dramatically
> increases the cost of the CPU die. So if we can just point at the
> memory we want to "extract" instead of actually copying the data into
> its own string object, we can potentially save a ton of CPU cycles,
> especially when working with data inside a loop.
>
> Overall, I think substrings offer the most obvious/apparent area for
> performance gains and probably have, implementation details aside, the
> least amount of friction. But maybe we should consider the larger
> ecosystem of string functions as well? Or should this just be a
> possible longer term idea that requires more thought and research and
> thus the scope should be limited and we put Lydia's idea under Future
> Scope in the RFC? Other thoughts/comments?
>
> Added as Open Issue 10 to the RFC. Thank you for your input.
>
> Thomas Hruska

Thanks for your kind and extended explanation.
I know a little about memory allocation.
But I am not sure what to conclude from your explanation: would an object involve less copying around or not?

This memory conversation brings up other old memories ☺... peek, poke, assembly etc 😍

Greetz, flexJoly (aka Lydia)
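
P.S. The difference Thomas describes between copying a substring and just pointing at it can be sketched in C. This is only an illustration under stated assumptions, not how PHP or the RFC implements strings: the `str_view` type and the functions `view_of`, `substr_copy`, and `substr_view` are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical view type: a pointer into an existing buffer plus a length.
   It owns nothing, so it must not outlive the buffer it points into. */
typedef struct {
    const char *ptr;
    size_t len;
} str_view;

/* Wrap a NUL-terminated string in a view (one strlen, no copy). */
str_view view_of(const char *s)
{
    str_view v = { s, strlen(s) };
    return v;
}

/* Copying substring: allocates a new buffer and copies len bytes into it.
   The caller owns the result and must free() it. Returns NULL if the
   allocation fails. */
char *substr_copy(str_view src, size_t start, size_t len)
{
    assert(start <= src.len && len <= src.len - start);
    char *out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, src.ptr + start, len);
    out[len] = '\0';
    return out;
}

/* Viewing substring: no allocation and no byte copying, only pointer
   arithmetic and a bounds check. */
str_view substr_view(str_view src, size_t start, size_t len)
{
    assert(start <= src.len && len <= src.len - start);
    str_view v = { src.ptr + start, len };
    return v;
}
```

Calling `substr_copy` in a loop pays for an allocation and a `memcpy` every time, while `substr_view` only fills in two small values that fit in registers. The price is the safety problem Thomas mentions: the view becomes a dangling pointer as soon as the original buffer is freed or moved, which is why exposing something like this to userland is the hard part.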