Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-31 Thread Pierre Joye
hello,

On Sun, Dec 31, 2023, 6:59 PM Rowan Tommins  wrote:

> Then one of us is missing something very fundamental. As I understand it,
> Swoole's model is similar to that popularised by node.js: a single thread
> processes multiple incoming requests concurrently, using asynchronous I/O.


The node.js curse, yes, where async and co. may actually slow down your
whole node.

> 08 Request A resumed with DB result
> 09 Request A formats and returns response
> 10 Request A complete
> 11 Request B resumed
> 12 Request B formats and returns response
>

PHP handles this in thread-safe mode, as mod_php does, for example. That is
why FrankenPHP requires a thread-safe (TS) build of PHP. Requests are handled
by a thread pool, not by a single-threaded event loop, which may block all
requests.

best,
Pierre


Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-31 Thread Rowan Tommins
On 31 December 2023 08:31:16 GMT, "Kévin Dunglas"  wrote:
>This new function is intended for SAPIs. Swoole was given as an example of
>worker mode, but it isn't a SAPI. AFAIK, it doesn't use the SAPI
>infrastructure provided by PHP.
>The scope of my proposal is only to provide a new feature in the SAPI
>infrastructure to build worker modes to handle HTTP requests, not to deal
>with non-SAPI engines.

One of the advantages you suggested of your proposal is that users would have a 
consistent way to write worker scripts. To achieve that, you want a *design* 
that can be adopted by as many implementations as possible, regardless of how 
they implement it. Providing helper infrastructure for that design is a 
secondary concern - as you admit, the actual code you're proposing to add is 
quite short.


>That being said, I don't understand what would prevent Swoole from
>implementing the proposed API

Then one of us is missing something very fundamental. As I understand it, 
Swoole's model is similar to that popularised by node.js: a single thread 
processes multiple incoming requests concurrently, using asynchronous I/O. For 
instance, a thread might run the following:

01 Request A received
02 Request A input validated
03 Request A sends async query to DB
04 Request A hands control to event loop while it awaits result
05 Request B received
06 Request B sends async HTTP call to some API
07 Request B awaits result
08 Request A resumed with DB result
09 Request A formats and returns response
10 Request A complete
11 Request B resumed
12 Request B formats and returns response

Each request has its own call stack, started by a different call to the 
registered event handler, but any global state is shared between them - there 
is no actual threading going on, so no partitioned memory.

If requests are communicated by setting up superglobals, that will happen at 
step 01 and again at step 05. If you try to read from them at step 09, you 
would see them populated with information about request B, but you're trying to 
handle request A.
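
The clobbering described above can be reproduced in a minimal sketch. It 
assumes PHP 8.1+ and uses Fibers as a stand-in for an event loop; it is not 
Swoole's API, and handleRequest() and the 'user' key are hypothetical names 
invented for the illustration.

```php
<?php
// Hypothetical simulation: Fibers (PHP 8.1+) stand in for an async event loop.
// handleRequest() and the 'user' key are invented for this illustration.

function handleRequest(array $query): string
{
    // Steps 01 / 05: the "server" populates the superglobal for this request.
    $_GET = $query;

    // Steps 04 / 07: hand control back to the loop while "awaiting" I/O.
    Fiber::suspend();

    // Steps 09 / 12: read the superglobal again after being resumed.
    return 'Hello, ' . $_GET['user'];
}

$a = new Fiber('handleRequest');
$b = new Fiber('handleRequest');

$a->start(['user' => 'alice']); // request A received, then suspended
$b->start(['user' => 'bob']);   // request B received, overwrites $_GET

$a->resume();                   // request A resumed with its "DB result"
$b->resume();                   // request B resumed

$responseA = $a->getReturn();
$responseB = $b->getReturn();

echo $responseA, "\n"; // request A greets bob, not alice
echo $responseB, "\n";
```

Both responses read "Hello, bob", because the two call stacks share one 
$_GET and request B overwrote it before request A was resumed.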

It would be possible to work around that by warning users prominently not to 
read superglobals after any async call - basically forcing them to create 
scoped copies to pass around. But the worse problem is output: if step 09 and 
step 12 both just use "echo", how do you track which output needs to go to 
which network connection? You can't just set up an output buffer, because 
that's global state shared by both call stacks. You have to put *something* 
into the scope of the call stack - a callback to write output, an expected 
return value, etc.

Asynchronous code ends up somewhat resembling functional programming: 
everything you want to have side effects on needs to be passed around as 
parameters and return values, because the only thing isolated between requests 
is local variable scope.
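
A minimal sketch of that style follows. It again assumes PHP 8.1+ Fibers as a 
stand-in for an event loop; handleScoped() and the 'connection', 'user' and 
'body' array keys are hypothetical names, not part of any proposed API. The 
request arrives as a parameter and the response leaves as a return value, so 
nothing depends on global state.

```php
<?php
// Hypothetical sketch: handleScoped() and the array shapes are invented.

function handleScoped(array $request): array
{
    // Stand-in for awaiting async I/O; other requests may run meanwhile.
    Fiber::suspend();

    // Only local scope is read, so other requests cannot interfere.
    return [
        'connection' => $request['connection'],
        'body'       => 'Hello, ' . $request['user'],
    ];
}

$requests = [
    ['connection' => 1, 'user' => 'alice'],
    ['connection' => 2, 'user' => 'bob'],
];

// Start every request; each one suspends at its "I/O" point.
$fibers = [];
foreach ($requests as $request) {
    $fiber = new Fiber('handleScoped');
    $fiber->start($request);
    $fibers[] = $fiber;
}

// Resume each request and collect the response it returns.
$responses = [];
foreach ($fibers as $fiber) {
    $fiber->resume();
    $responses[] = $fiber->getReturn();
}

// The "server" knows which connection each response belongs to.
foreach ($responses as $response) {
    printf("connection %d: %s\n", $response['connection'], $response['body']);
}
```

Because the response carries its own connection identifier, the caller can 
route output without an output buffer shared between call stacks.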
 
Regards,

-- 
Rowan Tommins
[IMSoP]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-31 Thread Kévin Dunglas
On Sun, Dec 31, 2023 at 2:20 AM Rowan Tommins wrote:

> On 30 December 2023 19:48:39 GMT, Larry Garfield wrote:
> > The Franken-model is closer to how PHP-FPM works today, which means that
> > it is easier to port existing code to, especially existing code that has
> > lots of globals or hidden globals. (Eg, Laravel.) That may or may not make
> > it the better model overall, I don't know, but it's the more-similar model.
>
> That's why I said earlier that it provides better backwards compatibility
> - existing code which directly uses PHP's current global state can more
> easily be run in a worker which populates that global state.
>
> However, the benefit is marginal, for two reasons. Firstly, because in
> practice a lot of applications avoid touching the global state outside of
> some request bootstrapping code anyway. The FrankenPHP example code and
> Laravel Octane both demonstrate this.
>
> Secondly, because in an environment that handles a single request at a
> time, the reverse is also possible: if the server passes request
> information directly to a callback, that callback can populate the
> superglobals as appropriate. The only caveat I can think of is input
> streams, since userland code can't reset and populate php://input, or
> repoint STDOUT.
>
> On the other hand, as soon as you have any form of concurrency, the two
> models are not interchangeable - it would make no sense for an asynchronous
> callback to read from or write to global state.
>
> And that's what I meant about FrankenPHP's API having poor forward
> compatibility - if you standardise on an API that populates global state,
> you close off any possibility of using that API in a concurrent
> environment. If you instead standardise on callbacks which hold request and
> response information in their own scope, you don't close anything off.
>
> If anything, calling this "forwards compatibility" is overly generous: the
> OP gave Swoole as an example of an existing worker environment, but I can't
> see any way that Swoole could implement an API that communicated request
> and response information via global state.
>
> Regards,
>
> --
> Rowan Tommins
> [IMSoP]


This new function is intended for SAPIs. Swoole was given as an example of
worker mode, but it isn't a SAPI. AFAIK, it doesn't use the SAPI
infrastructure provided by PHP.
The scope of my proposal is only to provide a new feature in the SAPI
infrastructure to build worker modes to handle HTTP requests, not to deal
with non-SAPI engines.
That being said, I don't understand what would prevent Swoole from
implementing the proposed API, or what would prevent a userland
implementation of the proposed API that uses Swoole under the hood.
It seems doable to emulate the sequential request handling and to create an
adapter from their custom objects to superglobals and streams.

For WebSockets and WebTransports, the same considerations apply. The SAPI
API will have to be extended to deal with such low-level network layers,
worker mode or not. To me, this is very interesting (and needed) but should
be discussed in another RFC.

As pointed out by Crell, FrankenPHP (and similar theoretical solutions)
starts as many workers as needed. This can be a fixed set of workers, as in
FrankenPHP, or a dynamic number of workers, similar to traditional FPM
workers.

FrankenPHP uses threads to parallelize request handling (to start several
instances of the worker script in parallel). Other techniques could be
used. For instance, in the future, we could use goroutines instead of
threads, by adding a new backend to TSRM. Goroutines use a mix of system
threads and async IO, and several goroutines can be handled by a single
system thread:
https://github.com/golang/go/blob/master/src/runtime/HACKING.md#gs-ms-ps

The global state is never reset in the same worker context; it is preserved
across requests, except for superglobals and streams, which are updated
with the data of the request being handled.

Superglobals are the PHP way to expose CGI-like data. Adding support for
other ways to do it, such as the one proposed by WSGI, and/or new objects
and the like could be interesting, but again this isn't in the scope of this
proposal, which is narrow and tries to reuse the existing infrastructure as
much as possible. The proposal is simple enough to support new ways if they
are introduced at some point in PHP, and the Symfony Runtime and Laravel
Octane libraries prove that it's possible to implement more advanced data
structures in userland on top of the existing superglobals infrastructure.

Regarding the infinite loop, we could indeed remove it using a few lines of
code. I hesitated to do that initially, but the loop gives more flexibility
by allowing the implementation of many features in user-land (like
restarting the worker after a fixed number of requests, when the memory
reaches a certain level, etc). Without this loop, all these