Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-30 Thread Rowan Tommins
On 30 December 2023 19:48:39 GMT, Larry Garfield wrote:
>The Franken-model is closer to how PHP-FPM works today, which means it is 
>easier to port existing code to it, especially existing code that has lots of 
>globals or hidden globals.  (Eg, Laravel.)  That may or may not make it the 
>better model overall, I don't know, but it's the more-similar model.

That's why I said earlier that it provides better backwards compatibility - 
existing code which directly uses PHP's current global state can more easily be 
run in a worker which populates that global state.

However, the benefit is marginal, for two reasons. Firstly, because in practice 
a lot of applications avoid touching the global state outside of some request 
bootstrapping code anyway. The FrankenPHP example code and Laravel Octane both 
demonstrate this.

Secondly, because in an environment that handles a single request at a time, 
the reverse is also possible: if the server passes request information directly 
to a callback, that callback can populate the superglobals as appropriate. The 
only caveat I can think of is input streams, since userland code can't reset 
and populate php://input, or repoint STDOUT.
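
As a rough illustration of that reverse direction, here is a minimal sketch,
assuming a server that hands each request to a callback as a plain array. The
function name withSuperglobals and the array keys ('query', 'post', and so on)
are invented for this example and are not part of any existing SAPI.

```php
<?php
// Hypothetical adapter: wraps code written against superglobals so that it
// can be driven by a server that passes request data to a callback instead.
function withSuperglobals(callable $legacyApp): callable
{
    return function (array $request) use ($legacyApp): string {
        // Repopulate the global state from the scoped request data.
        $_GET = $request['query'] ?? [];
        $_POST = $request['post'] ?? [];
        $_COOKIE = $request['cookies'] ?? [];
        $_SERVER['REQUEST_METHOD'] = $request['method'] ?? 'GET';
        $_SERVER['REQUEST_URI'] = $request['uri'] ?? '/';

        // Capture echo/print output. php://input and direct writes to
        // STDOUT cannot be redirected this way (the caveat noted above).
        ob_start();
        try {
            $legacyApp();
            return (string) ob_get_contents();
        } finally {
            ob_end_clean();
        }
    };
}

// Legacy-style code that only knows about superglobals.
$legacy = function (): void {
    echo 'Hello, ' . ($_GET['name'] ?? 'world');
};

$handler = withSuperglobals($legacy);
$body = $handler([
    'method' => 'GET',
    'uri' => '/?name=Rowan',
    'query' => ['name' => 'Rowan'],
]);
```

This only works while one request is handled at a time, because every call
overwrites the same global arrays.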

On the other hand, as soon as you have any form of concurrency, the two models 
are not interchangeable - it would make no sense for an asynchronous callback 
to read from or write to global state.
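
A small sketch of why the two models diverge under concurrency, using Fibers
(PHP 8.1+) to stand in for two requests being interleaved. The helper names
handleViaGlobals, handleViaScope and runInterleaved are made up for the
illustration; no real server API is implied.

```php
<?php
// A "request" whose data is published through global state.
function handleViaGlobals(string $id): Fiber
{
    return new Fiber(function () use ($id): string {
        $_GET['id'] = $id;   // server populates global state
        Fiber::suspend();    // handler waits on I/O; another request runs
        return $_GET['id'];  // may now belong to a different request
    });
}

// The same "request", but its data lives in the callback's own scope.
function handleViaScope(string $id): Fiber
{
    return new Fiber(function () use ($id): string {
        $request = ['id' => $id];
        Fiber::suspend();
        return $request['id'];
    });
}

// Start both fibers, then resume both, so that their work interleaves.
function runInterleaved(Fiber $a, Fiber $b): array
{
    $a->start();
    $b->start();
    $a->resume();
    $b->resume();
    return [$a->getReturn(), $b->getReturn()];
}

$globalResult = runInterleaved(handleViaGlobals('A'), handleViaGlobals('B'));
$scopedResult = runInterleaved(handleViaScope('A'), handleViaScope('B'));
```

With global state, the first handler reads the value written by the second;
with scoped variables, each handler keeps its own request.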

And that's what I meant about FrankenPHP's API having poor forward 
compatibility - if you standardise on an API that populates global state, you 
close off any possibility of using that API in a concurrent environment. If you 
instead standardise on callbacks which hold request and response information in 
their own scope, you don't close anything off.

If anything, calling this "forwards compatibility" is overly generous: the OP 
gave Swoole as an example of an existing worker environment, but I can't see 
any way that Swoole could implement an API that communicated request and 
response information via global state.

Regards,

-- 
Rowan Tommins
[IMSoP]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-30 Thread Michał Marcin Brzuchalski
Hi Larry,

On Sat, 30 Dec 2023 at 20:49, Larry Garfield wrote:

> On Sat, Dec 30, 2023, at 4:53 AM, Rowan Tommins wrote:
> > On 30 December 2023 09:59:07 GMT, Robert Landers
> >  wrote:
> >>For this to happen in PHP Core, there would need to be request objects
> >>instead of a global state.
> >
> > Again, the representation as objects isn't a key requirement. Python's
> > WSGI spec simply has a dictionary (read: associative array) of the
> > environment based on CGI. The application might well turn that into a
> > more powerful object, but standardisation of such wasn't considered a
> > pre-requisite, and would actually have hampered ASGI, where not all
> > events represent an HTTP request.
> >
> > The key requirement is that you have some way of passing the current
> > request and response around as scoped variables, not global state.
> > That's essential for any kind of concurrent run-time (async,
> > thread-aware, etc).
> >
> > An event / subscriber model fits well with that: the local scope for
> > each request is set up by an invocation of the callback with defined
> > parameters and return value.
> >
> > Funnily enough, the example of a worker script for FrankenPHP does both
> > things: it sends each request to the same application "handle"
> > callback, passing in the super-global arrays as parameters to be used
> > as non-global state. https://frankenphp.dev/docs/worker/#custom-apps So
> > really all I'm arguing is that a few more lines of that PHP example be
> > moved into the C implementation, so that the user only needs to provide
> > that inner callable, not the outer while loop.
>
> So you're suggesting something like:
>
> $app->initializeStuffHowever();
> set_event_handler(Closure $handler);
> // Script blocks here until a sigkill is received, or something.
>
> I think there's an important distinction that is getting missed in the
> above discussion, beyond the push-vs-pull question.  FrankenPHP, as I
> understand it, pre-boots multiple worker processes, keeps them in memory,
> and then handles each request in its own process.  Swoole,
> Amp/React/Revolt, and friends have only a single process running at all,
> and make use of async to simulate multiple simultaneous requests, a la
> Node.js.  That means mutable global variables in the FrankenPHP model still
> won't leak between parallel requests, whereas they absolutely would/do in a
> Swoole/Revolt world.
>
> I'm not going to call one of those better or worse (I don't have enough
> experience with either to say), but they are different beasts for which
> first class support would be different SAPIs either way.  They're not
> mutually exclusive thanks to Fibers (which mean you don't need the entire
> call stack to be async), but you would want to pick one or the other as
> primary runner mode of an application.  Let's keep that in mind when making
> comparisons.
>
> The Franken-model is closer to how PHP-FPM works today, which means it
> is easier to port existing code to it, especially existing code that has lots
> of globals or hidden globals.  (Eg, Laravel.)  That may or may not make it
> the better model overall, I don't know, but it's the more-similar model.
>
> All that said, the idea of allowing a "persistent HTTP handler process"
> SAPI, "persistent Queue handler process" SAPI, and "persistent cron handler
> process" SAPI (or whatever combination of persistent processes) to all run
> side by side with the same code base but different entry point scripts
> is...  Hot.  If we can do something that would enable that kind of runtime
> model, I am very much here for that.
>

You make some good points (as usual).
I'm not an expert (yet!) but I was playing around with a callable that tries
to mimic ASGI:
https://github.com/brzuchal/asgi-playground/blob/main/app.php#L26-L37

What I think currently (maybe too hastily, but...) is that this kind of
approach is flexible enough to easily handle many SAPIs, each of which
identifies its capabilities to the app,
and the app decides what it can handle, and how, based on `$scope['type']` in
the example code.
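
To make the idea concrete, here is a minimal sketch of an ASGI-style callable
in PHP. It is not taken from the linked playground. The 'http.response.start'
and 'http.response.body' event names follow the ASGI HTTP spec, while the
'queue' scope type, its events, and the stand-in $send/$receive closures are
invented for the example.

```php
<?php
// Hypothetical ASGI-style application callable; not an existing PHP API.
function app(array $scope, callable $receive, callable $send): void
{
    switch ($scope['type']) {
        case 'http':
            $send([
                'type' => 'http.response.start',
                'status' => 200,
                'headers' => [['content-type', 'text/plain']],
            ]);
            $send([
                'type' => 'http.response.body',
                'body' => 'Hello from ' . $scope['path'],
            ]);
            return;

        case 'queue': // invented scope type for a queue-consuming SAPI
            $message = $receive();
            $send(['type' => 'queue.ack', 'id' => $message['id']]);
            return;

        default:
            throw new InvalidArgumentException(
                "Unsupported scope type: {$scope['type']}"
            );
    }
}

// Stand-in for a SAPI: records whatever the application sends.
$sent = [];
$send = function (array $event) use (&$sent): void {
    $sent[] = $event;
};
$receive = fn (): array => ['type' => 'queue.message', 'id' => 42];

app(['type' => 'http', 'path' => '/ping'], $receive, $send);
app(['type' => 'queue'], $receive, $send);
```

The application never touches global state: everything it needs arrives in
$scope or through $receive, and everything it produces leaves through $send.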

I know there is a Runtime library that tries to integrate
Symfony/Laravel with many SAPIs, but as far as I understood, the discussion
turned to figuring out whether there is some kind of standard approach that
could be shaped under the PHP umbrella.

Maybe this is just a temporary fascination with the ASGI solution; could be.
If this is not of interest to anyone, then forgive me, I won't
bother you anymore.

Cheers,
--
Michał Marcin Brzuchalski


Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-30 Thread Larry Garfield
On Sat, Dec 30, 2023, at 4:53 AM, Rowan Tommins wrote:
> On 30 December 2023 09:59:07 GMT, Robert Landers 
>  wrote:
>>For this to happen in PHP Core, there would need to be request objects
>>instead of a global state.
>
> Again, the representation as objects isn't a key requirement. Python's 
> WSGI spec simply has a dictionary (read: associative array) of the 
> environment based on CGI. The application might well turn that into a 
> more powerful object, but standardisation of such wasn't considered a 
> pre-requisite, and would actually have hampered ASGI, where not all 
> events represent an HTTP request.
>
> The key requirement is that you have some way of passing the current 
> request and response around as scoped variables, not global state. 
> That's essential for any kind of concurrent run-time (async, 
> thread-aware, etc).
>
> An event / subscriber model fits well with that: the local scope for 
> each request is set up by an invocation of the callback with defined 
> parameters and return value.
>
> Funnily enough, the example of a worker script for FrankenPHP does both 
> things: it sends each request to the same application "handle" 
> callback, passing in the super-global arrays as parameters to be used 
> as non-global state. https://frankenphp.dev/docs/worker/#custom-apps So 
> really all I'm arguing is that a few more lines of that PHP example be 
> moved into the C implementation, so that the user only needs to provide 
> that inner callable, not the outer while loop.

So you're suggesting something like:

$app->initializeStuffHowever();
set_event_handler(Closure $handler);
// Script blocks here until a sigkill is received, or something.

I think there's an important distinction that is getting missed in the above 
discussion, beyond the push-vs-pull question.  FrankenPHP, as I understand it, 
pre-boots multiple worker processes, keeps them in memory, and then handles 
each request in its own process.  Swoole, Amp/React/Revolt, and friends have 
only a single process running at all, and make use of async to simulate 
multiple simultaneous requests, à la Node.js.  That means mutable global 
variables in the FrankenPHP model still won't leak between parallel requests, 
whereas they absolutely would/do in a Swoole/Revolt world.

I'm not going to call one of those better or worse (I don't have enough 
experience with either to say), but they are different beasts for which first 
class support would be different SAPIs either way.  They're not mutually 
exclusive thanks to Fibers (which mean you don't need the entire call stack to 
be async), but you would want to pick one or the other as primary runner mode 
of an application.  Let's keep that in mind when making comparisons.

The Franken-model is closer to how PHP-FPM works today, which means it is 
easier to port existing code to it, especially existing code that has lots of 
globals or hidden globals.  (Eg, Laravel.)  That may or may not make it the 
better model overall, I don't know, but it's the more-similar model.

All that said, the idea of allowing a "persistent HTTP handler process" SAPI, 
"persistent Queue handler process" SAPI, and "persistent cron handler process" 
SAPI (or whatever combination of persistent processes) to all run side by side 
with the same code base but different entry point scripts is...  Hot.  If we 
can do something that would enable that kind of runtime model, I am very much 
here for that.

--Larry Garfield




Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-30 Thread Niels Dossche
Hi Robert

On 30/12/2023 10:25, Robert Landers wrote:
> Hi Niels,
> 
>> They are indeed going to be very similar, but at least having better return 
>> types would be good to give one particular example.
>> e.g. we currently have a lot of methods that can return an object or false. 
>> The current living DOM spec always throws exceptions instead of returning 
>> false on error which is a much cleaner API.
>> Furthermore, we have the DOMNameSpaceNode that can be returned by some 
>> methods and has been a point of confusion for static analysis tools (I did a 
>> PR on psalm to fix one of those issues).
>> That node type won't be special cased in the new classes API so the 
>> (inconsistent use of the) union of DOMAttr|DOMNameSpaceNode will go away.
> 
> Actually, I'm not sure it is supposed to be throwing exceptions (if we
> look at https://html.spec.whatwg.org/multipage/parsing.html#parse-errors);
> in fact, I'd argue there are three different ways to handle errors
> (from some experience in writing a parser from scratch):

I'm not talking about handling parser errors.
Parser errors indeed should not be handled via exceptions; they emit a warning 
and parsing continues with error recovery as described in the spec.
This was part of my HTML 5 RFC: 
https://wiki.php.net/rfc/domdocument_html5_parser

I'm talking about methods like createElement, setAttributeNode, ... that can 
fail due to errors.
In DOM 3 (and therefore PHP too), there was a "strictErrorChecking" boolean 
option.
When enabled, exceptions are thrown when the constraints of such methods are 
not met.
When disabled, no exception is thrown, but a warning is emitted and false is 
returned instead.
The DOM living spec no longer has that option and always uses exceptions.

In the new classes I would also only use exceptions and not include the 
strictErrorChecking option, as the spec demands.
This cleans up return types.

For example: $doc->createElement("") should throw.
Or $element->setAttributeNode($attr) should throw when $attr is already used by 
another element.
Etc.
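
For reference, a small sketch of the two behaviours with the existing
DOMDocument class, as far as I understand them on current PHP 8 with ext-dom
loaded. The helper describeCreateElement is made up for the illustration.

```php
<?php
// Try to create an element and report how a failure surfaces.
function describeCreateElement(DOMDocument $doc, string $name): string
{
    try {
        // The @ hides the warning emitted in non-strict mode.
        $result = @$doc->createElement($name);
    } catch (DOMException $e) {
        return 'exception';
    }
    return $result === false ? 'false' : 'element';
}

$strict = new DOMDocument();           // strictErrorChecking defaults to true

$lenient = new DOMDocument();
$lenient->strictErrorChecking = false; // DOM 3 style opt-out

$outcomes = [
    'strict, empty name'  => describeCreateElement($strict, ''),
    'lenient, empty name' => describeCreateElement($lenient, ''),
    'strict, valid name'  => describeCreateElement($strict, 'p'),
];
```

The lenient case is what forces the object|false return types today; dropping
the option would leave only the exception path.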

> 
> 1. Acting as a user-agent: in this case, errors should be handled as
> described in the spec for a user-agent, e.g., switching to Text-Mode
> in some cases and gobbling up the rest of the document.

The HTML 5 RFC follows the spec error recovery rules for user agents.

> 
> 2. Acting as a conformance checker: in this case, a list of errors
> should be available to the programmer instead of bailing when parsing
> (e.g., not switching to Text-Mode, but trying to continue parsing the
> document, as described in the parser spec for conformance checking).
> 
> 3. Acting as a document builder: Putting the document into an invalid
> state should emit at least a warning. However, it's likely better to
> let the user-agent handle the invalid DOM (as this is probably more
> forward-thinking for new HTML that currently doesn't exist). This is
> actually one of the biggest draw-backs to the current implementation
> as it requires a number of "hacks" to build valid HTML.

Kind regards
Niels




Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-30 Thread Rowan Tommins
On 30 December 2023 09:59:07 GMT, Robert Landers wrote:
>For this to happen in PHP Core, there would need to be request objects
>instead of a global state.

Again, the representation as objects isn't a key requirement. Python's WSGI 
spec simply has a dictionary (read: associative array) of the environment based 
on CGI. The application might well turn that into a more powerful object, but 
standardisation of such wasn't considered a pre-requisite, and would actually 
have hampered ASGI, where not all events represent an HTTP request.

The key requirement is that you have some way of passing the current request 
and response around as scoped variables, not global state. That's essential for 
any kind of concurrent run-time (async, thread-aware, etc).

An event / subscriber model fits well with that: the local scope for each 
request is set up by an invocation of the callback with defined parameters and 
return value.

Funnily enough, the example of a worker script for FrankenPHP does both things: 
it sends each request to the same application "handle" callback, passing in the 
super-global arrays as parameters to be used as non-global state. 
https://frankenphp.dev/docs/worker/#custom-apps So really all I'm arguing is 
that a few more lines of that PHP example be moved into the C implementation, 
so that the user only needs to provide that inner callable, not the outer while 
loop.
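
A rough sketch of that split, with a userland function standing in for the
part that would move into C. The name serve and the shape of the request
arrays are assumptions made for the example; this is not FrankenPHP's actual
API.

```php
<?php
// Stand-in for the SAPI side: it owns the loop and invokes the user's
// callable once per request, passing the request data as parameters.
function serve(iterable $incoming, callable $handler): array
{
    $responses = [];
    foreach ($incoming as $request) {
        $responses[] = $handler($request['get'], $request['server']);
    }
    return $responses;
}

// All the user has to write: the inner callable.
$handle = function (array $get, array $server): string {
    return $server['REQUEST_URI'] . ' -> ' . ($get['page'] ?? 'home');
};

$responses = serve([
    ['get' => ['page' => 'about'], 'server' => ['REQUEST_URI' => '/?page=about']],
    ['get' => [], 'server' => ['REQUEST_URI' => '/']],
], $handle);
```

Because the handler only sees its parameters, the same callable could later be
invoked by a server that runs several requests concurrently.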

Regards,

-- 
Rowan Tommins
[IMSoP]




Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-30 Thread Michał Marcin Brzuchalski
Hi Robert,

On Sat, 30 Dec 2023, 10:59, Robert Landers wrote:

> > > - FrankenPHP expects the user to manage the main event loop ...
> > >
> > >
> > > This isn't exact. FrankenPHP does manage the event loop (the Go
> > > runtime manages it - through a channel - under the hood).
> >
> >
> > Perhaps "event loop" was the wrong term; what I was highlighting is that
> > to use FrankenPHP or RoadRunner, you have to write a while loop, which
> > explicitly handles one request at a time. In Swoole, there is no such
> > loop: you register event handlers and then call $server->run() once.
> > Similarly, WSGI mandates that the server "invokes the application
> > callable once for each request it receives from an HTTP client".
> >
> > It's a distinction of pull/poll (the application must actively block
> > until next request) vs push/subscribe (the application is passively
> > invoked whenever needed).
>
> I think these models have different capabilities: A pull/poll model is
> quite simple, while a subscription model is usually more complex.
>
> With something simple like in FrankenPHP, creating a Queue SAPI, a
> WebSocket SAPI, etc isn't far off, where someone writes some PHP to
> consume a queue or websocket connections.
>
> > > I already replied to Crell about that. It will totally be possible to
> > > expose more complex HTTP message objects in the future,
> > > but PHP currently lacks such objects. The only things we have are
> > > superglobals (which are more or less similar to CGI variables, as done
> > > in WSGI) and streams. It's why we're using them.
> >
> >
> > The use of objects vs arrays wasn't the main difference I was trying to
> > highlight there, but rather the overall API of how information gets into
> > and out of the application. FrankenPHP is the only server listed which
> > needs to reset global state on each request, because the others
> > (including Python WSGI and ASGI) use non-global variables for both input
> > and output.
> >
> > I notice that the Laravel Octane adaptor for FrankenPHP takes that
> > global state and immediately converts it into non-global variables for
> > consumption by the application.
>
> For this to happen in PHP Core, there would need to be request objects
> instead of a global state. If an RFC implementing PSR
> requests/responses in Core is a pre-requisite for enabling what we're
> discussing here, I'd personally be all for that (as would a very large
> chunk of the PHP community, IMHO). I personally think this is a
> chicken/egg type of problem though. It doesn't make sense to have
> request/response objects right now, and I get the feeling that people
> would only support worker mode primitives if there were request
> objects... so, it might make sense to build a v1 of the worker mode
> primitives and then iterate towards request objects, because then
> there would be an actual need for them.
>

That is certainly not true. Looking at WSGI or ASGI, there is no need for
request/response objects. These can be provided in userland, which gives
more flexibility because userland is not bound by the BC-break policy that
governs PHP core.

Give me one solid argument on this topic and I may change my
mind.
For years I had the same impression, but low-level primitives are
more flexible, and we all know that.

Cheers,
Michał Marcin Brzuchalski



Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-30 Thread Sebastian Bergmann

On 29.12.2023 at 17:58, Larry Garfield wrote:

> I am also on team "yes, let's just do it right."  If that means the new
> classes are only 99% drop-ins for the old ones, I'm OK with that.  People
> can switch over when they're ready and do all the clean up at once.


+1




Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-30 Thread Robert Landers
> > - FrankenPHP expects the user to manage the main event loop ...
> >
> >
> > This isn't exact. FrankenPHP does manage the event loop (the Go
> > runtime manages it - through a channel - under the hood).
>
>
> Perhaps "event loop" was the wrong term; what I was highlighting is that
> to use FrankenPHP or RoadRunner, you have to write a while loop, which
> explicitly handles one request at a time. In Swoole, there is no such
> loop: you register event handlers and then call $server->run() once.
> Similarly, WSGI mandates that the server "invokes the application
> callable once for each request it receives from an HTTP client".
>
> It's a distinction of pull/poll (the application must actively block
> until next request) vs push/subscribe (the application is passively
> invoked whenever needed).

I think these models have different capabilities: A pull/poll model is
quite simple, while a subscription model is usually more complex.

With something simple like in FrankenPHP, creating a Queue SAPI, a
WebSocket SAPI, etc isn't far off, where someone writes some PHP to
consume a queue or websocket connections.
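
A minimal sketch of what such a pull-style consumer could look like, assuming
a "give me the next item" primitive. Here runWorker and the $next closure are
hypothetical stand-ins, and an in-memory SplQueue plays the role of the real
queue.

```php
<?php
// Hypothetical pull-style worker: the script itself asks for the next item
// and stops when there is none (a real primitive would block instead).
function runWorker(callable $next, callable $handle, int $maxJobs = PHP_INT_MAX): int
{
    $handled = 0;
    while ($handled < $maxJobs && ($job = $next()) !== null) {
        $handle($job);
        $handled++;
    }
    return $handled;
}

$queue = new SplQueue();
$queue->enqueue('send-email');
$queue->enqueue('resize-image');

$log = [];
$handled = runWorker(
    fn () => $queue->isEmpty() ? null : $queue->dequeue(),
    function (string $job) use (&$log): void {
        $log[] = "done: $job";
    }
);
```

The loop is easy to write and reason about, but it handles exactly one item
at a time by construction.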

> > I already replied to Crell about that. It will totally be possible to
> > expose more complex HTTP message objects in the future,
> > but PHP currently lacks such objects. The only things we have are
> > superglobals (which are more or less similar to CGI variables, as done
> > in WSGI) and streams. It's why we're using them.
>
>
> The use of objects vs arrays wasn't the main difference I was trying to
> highlight there, but rather the overall API of how information gets into
> and out of the application. FrankenPHP is the only server listed which
> needs to reset global state on each request, because the others
> (including Python WSGI and ASGI) use non-global variables for both input
> and output.
>
> I notice that the Laravel Octane adaptor for FrankenPHP takes that
> global state and immediately converts it into non-global variables for
> consumption by the application.

For this to happen in PHP Core, there would need to be request objects
instead of a global state. If an RFC implementing PSR
requests/responses in Core is a pre-requisite for enabling what we're
discussing here, I'd personally be all for that (as would a very large
chunk of the PHP community, IMHO). I personally think this is a
chicken/egg type of problem though. It doesn't make sense to have
request/response objects right now, and I get the feeling that people
would only support worker mode primitives if there were request
objects... so, it might make sense to build a v1 of the worker mode
primitives and then iterate towards request objects, because then
there would be an actual need for them.

Robert Landers
Software Engineer
Utrecht NL




Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-30 Thread Robert Landers
Hi Niels,

> They are indeed going to be very similar, but at least having better return 
> types would be good to give one particular example.
> e.g. we currently have a lot of methods that can return an object or false. 
> The current living DOM spec always throws exceptions instead of returning 
> false on error which is a much cleaner API.
> Furthermore, we have the DOMNameSpaceNode that can be returned by some 
> methods and has been a point of confusion for static analysis tools (I did a 
> PR on psalm to fix one of those issues).
> That node type won't be special cased in the new classes API so the 
> (inconsistent use of the) union of DOMAttr|DOMNameSpaceNode will go away.

Actually, I'm not sure it is supposed to be throwing exceptions (if we
look at https://html.spec.whatwg.org/multipage/parsing.html#parse-errors);
in fact, I'd argue there are three different ways to handle errors
(from some experience in writing a parser from scratch):

1. Acting as a user-agent: in this case, errors should be handled as
described in the spec for a user-agent, e.g., switching to Text-Mode
in some cases and gobbling up the rest of the document.

2. Acting as a conformance checker: in this case, a list of errors
should be available to the programmer instead of bailing when parsing
(e.g., not switching to Text-Mode, but trying to continue parsing the
document, as described in the parser spec for conformance checking).

3. Acting as a document builder: Putting the document into an invalid
state should emit at least a warning. However, it's likely better to
let the user-agent handle the invalid DOM (as this is probably more
forward-thinking for new HTML that currently doesn't exist). This is
actually one of the biggest draw-backs to the current implementation
as it requires a number of "hacks" to build valid HTML.




Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-30 Thread Michał Marcin Brzuchalski
Hi Rowan,

On Fri, 29 Dec 2023 at 23:56, Rowan Tommins wrote:

> On 29/12/2023 21:14, Kévin Dunglas wrote:
> ...
> The use of objects vs arrays wasn't the main difference I was trying to
> highlight there, but rather the overall API of how information gets into
> and out of the application. FrankenPHP is the only server listed which
> needs to reset global state on each request, because the others
> (including Python WSGI and ASGI) use non-global variables for both input
> and output.
>

I wasn't aware of ASGI. In the past I read about WSGI and noticed a PHP
connector allowing a PHP app to run inside a WSGI server.
I read most of the spec
https://asgi.readthedocs.io/en/latest/specs/index.html yesterday and it
sounds like a really solid solution.
Personally, I'd love to see something similar for PHP.
It'd clearly be something different from the usual PHP app, where the
$_GET and $_POST superglobals carry the HTTP request input.
The solution Python took is for the application to provide a callable with a
specific signature, no matter whether it is a simple function, a closure or an
object implementing __invoke, and this gives a lot of flexibility.
I believe that, given that ASGI provides an API for HTTP
interaction including WebSockets, this could only benefit the PHP ecosystem.
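
In PHP terms, the same flexibility falls out of the callable type. A minimal
sketch follows; runApp and the one-argument signature are assumptions for the
example, not a proposed API.

```php
<?php
// Hypothetical runner: accepts anything PHP considers callable.
function runApp(callable $app, array $scope): string
{
    return $app($scope);
}

// 1. A plain function, passed by name.
function plainFunctionApp(array $scope): string
{
    return 'function:' . $scope['path'];
}

// 2. A closure.
$closureApp = fn (array $scope): string => 'closure:' . $scope['path'];

// 3. An object implementing __invoke.
final class InvokableApp
{
    public function __invoke(array $scope): string
    {
        return 'object:' . $scope['path'];
    }
}

$results = [
    runApp('plainFunctionApp', ['path' => '/']),
    runApp($closureApp, ['path' => '/']),
    runApp(new InvokableApp(), ['path' => '/']),
];
```

The server only needs to agree on the signature; how the application is
structured behind it is left to userland.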

In the past, I was thinking about something similar to adopting WSGI but
was not aware of ASGI.

Cheers,
Michał Marcin Brzuchalski