Re: [PHP-DEV] Zephir, and other tangents

Mike Schinkel Wed, 11 Sep 2024 14:34:17 -0700

> On Sep 11, 2024, at 4:55 PM, Rowan Tommins [IMSoP] <imsop....@rwec.co.uk> 
> wrote:
> On 11 September 2024 20:12:53 BST, Mike Schinkel <m...@newclarity.net> wrote:
>>> It also risks conflicting with a future language feature that overlaps, as 
>>> happened with all native functions marked as accepting string automatically 
>>> coercing nulls, but all userland ones rejecting it. Deprecating that 
>>> difference has caused a lot of friction.
>> 
>> That is a little different in that it was a behavior that occurred in both 
>> core and userland whereas only allowing operator overloading in core would 
>> mean there would be no userland differences that could conflict.
> 
> Historically, that's what we had for scalar parameters. All the way back in 
> PHP 4 (I think), the engine had a function called "zend_parse_parameters" 
> (ZPP), which took the PHP values a user provided, and either converted them 
> to the desired C type, or rejected them. In effect, it allowed functions 
> defined in extensions to declare scalar typed parameters.
> 
> Then in PHP 7, we added scalar type declarations for parameters in userland 
> functions, and had to work out how they fitted with those internal functions. 
> Part of the motivation for the strict_types toggle was to manage their 
> behaviour; and userland functions require explicit nullable types, whereas 
> ZPP historically coerced nulls regardless.
> 
> Anything we let extensions do could end up with the same dilemma later: do we 
> match userland to existing extension behaviour, change extension behaviour, 
> or live with an awkward inconsistency?


There are several levels of hypotheticals in your stated concern, so yes, if 
every one of those decisions turns out to be regrettable then we might have a 
problem.  But if we allow several levels of hypotheticals to be criteria for 
every decision we make then we can probably find an argument against any 
improvement.  

That said, not doing this seems to matter more to you than doing it matters to 
me, so I relent.

>> WebAssembly has a deny-by-default design so could be something to seriously 
>> consider for extensibility in PHP. Implementations start with a full 
>> sandbox[2] and only add what they need to avoid those kinds of concerns. 
> 
> The problem is that second part: in order to be useful for writing a PHP 
> extension, what would we need to let through the sandbox?

Initially, only scalar parameters and scalar return values. (Stop and 
contemplate this for a moment, please.)

Then we can consider "letting through" more in future RFCs as we all learn more 
about using WASM and so that each thing we add can be well considered.

>> I think that actually supports what I was saying; people would gravitate to 
>> only doing in an extension what they cannot do in PHP itself, and over time 
>> if PHP itself improves there is reason to migrate more code to PHP.  
>> 
>> But there can still be reasons to not allow some thing in userland. Some 
>> things like __toArray.
> 
> I think there are ideas pulling in opposite directions here: on the one hand, 
> using the difficulty of building extensions as a deliberate "speed bump" to 
> avoid people using features "badly"; but on the other hand, wanting to reduce 
> the difficulty of building extensions.

That is one way to look at it, but not the only way and not the way I was 
looking at it.

Let's say the effort to write something in PHP is 1 and the effort to write it 
in C is 100, using arbitrary units where 100 means almost nobody ever does it. 
My argument is that 100 is too high a bar, but that 50 or 25 might be low 
enough that people actually do it occasionally without opening the floodgates, 
since even 25 is clearly many times harder than 1.

And no need to bikeshed those numbers. I was using arbitrary numbers to try and 
illustrate a concept, nothing more.

BTW, I was not looking at it as a "speed bump," I was looking at it as the fact 
that features in core are written in C, not PHP.

> I think the latter is a more noble goal, and one way to help is to make it 
> less *necessary* to build extensions, by adding to the core language the 
> things you currently need extensions to do. Things like efficient string 
> buffers and binary stream manipulation, or attributes and magic methods to 
> override object behaviour.

That is one way to consider it, and one I would definitely like to see. 

But I still do not think you can optimize out all need for low level languages 
by adding userland features.  

For example, how do we add a feature that allows for efficient looping? 
Certainly we can optimize the loops themselves, as we already have with 
functions like `array_map()`, but each iteration would still need to context 
switch between C code and the PHP callable and its zvals. That context 
switching is probably at least an order of magnitude more time consuming than 
a single pure loop iteration and memory location update in WASM, C, etc. Or do 
you envision some way to optimize running code in loops, and storing and 
accessing data in byte arrays, that I am not seeing?
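
To make the looping concern concrete, here is a minimal sketch of the two 
shapes of code I mean. The variable names and the sum-of-squares task are just 
illustrative assumptions, and I am not claiming any particular timings; the 
point is only where the per-element call into a PHP callable happens.

```php
<?php
declare(strict_types=1);

// Hypothetical workload: sum the squares of 1..1000.
$data = range(1, 1000);

// Variant 1: array_map() is implemented in C, but it still has to invoke
// the PHP closure once per element, passing and returning PHP values.
$viaMap = array_sum(array_map(static fn (int $n): int => $n * $n, $data));

// Variant 2: a plain userland loop. No callable is invoked per element,
// but every iteration is still executed by the PHP engine.
$viaLoop = 0;
foreach ($data as $n) {
    $viaLoop += $n * $n;
}

// Both variants compute the same value.
var_dump($viaMap === $viaLoop);
```

Either way the inner work runs through the engine for every element, which is 
the part I do not see a userland feature removing.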

To summarize, I think PHP would benefit from:

1. Adding WASM for simple low-level extensibility that could run on shared 
hosts for things that are just not possible in PHP as described a few 
paragraphs prior, and where we could enhance functionality over time,

2. Constantly improving PHP the language, which is what you are solely 
advocating for over extensibility,

3. Enabling some parts of core — definitely some newer functions and classes, 
and maybe some existing ones — to be implemented in userland PHP that would 
ship embedded in PHP core to make maintenance easier and enable more people to 
potentially contribute to core, 

4. Allowing operator overloading for classes in core where the community agrees 
it makes sense, but if that is a bridge too far then I would be happy if we 
never allow operator overloading at all, and 

5. Typedefs FTW!

-Mike
