Re: [PHP-DEV] Switching max_execution_time from CPU time to wall-clock time and from SIGPROF to SIGALRM

2024-05-21 Thread Kévin Dunglas
Hello,

I'm in favor of merging Arnaud's patch for macOS while waiting for a better
solution like relying on Grand Central Dispatch or another non-signal-based
solution (https://github.com/php/php-src/pull/13468), which would allow
max_execution_time to work with ZTS builds on macOS as well.

I'm also in favor of using wall-clock time wherever possible (disclaimer:
I'm the original author of this feature for Linux and FreeBSD).
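
For readers less familiar with the two timers in the subject line, here is a
minimal sketch in C of the difference. It is not PHP's actual implementation;
arm_timeout() and timeout_signal() are hypothetical names used only for
illustration. ITIMER_REAL counts elapsed wall-clock time and raises SIGALRM,
while ITIMER_PROF counts CPU time consumed by the process and raises SIGPROF.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: arm a one-shot timeout, or disarm it with seconds == 0.
 * wall_clock != 0 -> ITIMER_REAL (elapsed real time, raises SIGALRM).
 * wall_clock == 0 -> ITIMER_PROF (CPU time used by the process, raises SIGPROF).
 * Returns the result of setitimer(): 0 on success, -1 on error. */
static int arm_timeout(long seconds, int wall_clock)
{
    struct itimerval t;

    assert(seconds >= 0);
    t.it_value.tv_sec = seconds;
    t.it_value.tv_usec = 0;
    t.it_interval.tv_sec = 0;   /* no repetition: one-shot timer */
    t.it_interval.tv_usec = 0;
    return setitimer(wall_clock ? ITIMER_REAL : ITIMER_PROF, &t, NULL);
}

/* Signal delivered when the selected timer expires. */
static int timeout_signal(int wall_clock)
{
    return wall_clock ? SIGALRM : SIGPROF;
}
```

With a CPU-time timer, a script that sleeps or waits on the network does not
consume its budget; with a wall-clock timer it does.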

Best,


[PHP-DEV] Re: [proposal] max_execution_time to a negative number

2024-04-11 Thread Kévin Dunglas
According to "man 2 setitimer", the same error should happen on Linux even
without zend_max_execution_timer:
https://github.com/php/php-src/blob/2079da0158bc91fff4edd85ac66c89b40c4faf3a/Zend/zend_execute_API.c#L1566

A C error will also occur if the value is greater than 999,999,999.

We should at least prevent the C error in such cases. I proposed a patch
normalizing these values to 0: https://github.com/php/php-src/pull/13942
It's still better than the current situation, and can still be considered
as "undefined" until the RFC is voted on.
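
As an illustration only (this is not the code of the linked pull request),
the normalization could look like the following C sketch. The function name
normalize_timeout() and the MAX_TIMEOUT_SECONDS constant are assumptions; the
bound is simply the 999,999,999 figure mentioned above, and 0 keeps its usual
meaning of "no limit".

```c
#include <assert.h>

/* Assumed upper bound, taken from the 999,999,999 figure quoted above. */
#define MAX_TIMEOUT_SECONDS 999999999L

/* Hypothetical helper: map out-of-range values to 0 (no limit) so that the
 * value later handed to the timer API can never trigger a C-level error. */
static long normalize_timeout(long seconds)
{
    long result = (seconds < 0 || seconds > MAX_TIMEOUT_SECONDS) ? 0 : seconds;

    /* Postcondition: the result is always inside the accepted range. */
    assert(result >= 0 && result <= MAX_TIMEOUT_SECONDS);
    return result;
}
```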


Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)

2024-04-04 Thread Kévin Dunglas
Data classes will be a very useful addition to "API Platform".

API Platform is a "resource-oriented" framework that strongly encourages
the use of "data-only" classes:
we use PHP classes both as a specification language to document the public
shape of web APIs (like an OpenAPI specification, but written in PHP
instead of JSON or YAML),
and as Data Transfer Objects containing the data to be serialized into JSON
(read), or the JSON payload deserialized into PHP objects (write).

Being able to encourage users to use structs (that's what we already call
this type of behavior-less class in our workshops) for these objects will
help us a lot.

Kévin


Re: [PHP-DEV] php-src docs

2024-02-11 Thread Kévin Dunglas
I strongly support this initiative. 

When I started writing a SAPI (even though I already had some experience with the
PHP code base), I spent a lot of time reading scattered articles on the 
subject, many of which were incomplete or outdated. Having a centralized place 
to search and contribute would make things a lot easier.
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



[PHP-DEV] Re: RFC proposal: worker mode primitives for SAPIs

2024-01-04 Thread Kévin Dunglas


> On Jan 4, 2024, at 18:21, Joanhey wrote:
> 
> Hi,
> 
> I like it as a way to start a discussion, which is necessary.
> But we need to see the big picture.
> 
> The CLI SAPI is the poor relation in PHP (contrary to other languages), but
> that is another discussion that I'll try to open later.
> 
> Create a Worker-SAPI?
> 
> First, CLI workers can't access the SAPI, so they don't get any benefit.
> Amp, React, Revolt, Workerman, Adapterman, Symfony Runtime,... can't
> access the internal SAPI functions. Each needs to recreate, in userland PHP
> code, functions that already exist in PHP SAPIs.
> Kudos to Ilija for the RFC1867 RFC. But we need to go further.
> It isn't possible to use header functions :(,
> https://github.com/php/php-src/issues/12304
> 
> 
> Then we have SAPIs that use PHP embed (really easy to use :)) or work in a
> similar way.
> 
> Here we find 2 ways: forks or threads!!
> With forks we can use Super-Globals, with threads it's impossible, and for 
> that they need to encapsulate it in Request/Response objects.
> 
> How will we reconcile both situations? This is where the discussion starts.
> 
> Forks:
> Frankenphp, RoadRunner, Ngx-php (the fastest PHP runtime),...
> Nginx Unit still uses a shared-nothing approach, but it's really easy to have
> both.
> 
> Threads:
> Swoole, OpenSwoole, Swoow,... in that situation the super globals are NOT 
> possible.
> 
> Some frameworks permit using both (forks or threads) depending on the
> master event loop that we choose. But they need to force everything into the
> threads way to have a unified interface.
> 
> We are talking about the main loop, because inside we can use any thread 
> system.
> 
> Thanks to all, and to Kevin for starting the discussion.
> 
> PS: Currently, any new PHP SAPI needs to be added to php-src to have OPcache
> enabled. Nginx Unit and others still use the cli-server SAPI to have it. That
> needs to be changed, so any SAPI can use it without being registered.
> 
> Regards
> Joan Miquel

Thanks for the summary!

For the record, FrankenPHP and NGINX Unit use threads, not forks (and recommend 
ZTS PHP builds). And as far as I understand, Swoole etc. use a reactor pattern
and non-blocking I/O, not threads. Also Symfony Runtime isn’t an engine but a
library with adapters for FrankenPHP, RoadRunner, Bref etc.

The main SAPI using libphp is the embed SAPI, but some non-core SAPIs including 
FrankenPHP, NGINX Unit and uWSGI use libphp with their own SAPIs. The confusing 
part is that you need to enable the embed SAPI through the configure options to 
build libphp, even if the embed SAPI itself isn’t used.

Best regards,




Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-31 Thread Kévin Dunglas
On Sun, Dec 31, 2023 at 2:20 AM Rowan Tommins 
wrote:

> On 30 December 2023 19:48:39 GMT, Larry Garfield 
> wrote:
> >The Franken-model is closer to how PHP-FPM works today, which means that
> >is easier to port existing code to, especially existing code that has lots
> >of globals or hidden globals.  (Eg, Laravel.)  That may or may not make it
> >the better model overall, I don't know, but it's the more-similar model.
>
> That's why I said earlier that it provides better backwards compatibility
> - existing code which directly uses PHP's current global state can more
> easily be run in a worker which populates that global state.
>
> However, the benefit is marginal, for two reasons. Firstly, because in
> practice a lot of applications avoid touching the global state outside of
> some request bootstrapping code anyway. The FrankenPHP example code and
> Laravel Octane both demonstrate this.
>
> Secondly, because in an environment that handles a single request at a
> time, the reverse is also possible: if the server passes request
> information directly to a callback, that callback can populate the
> superglobals as appropriate. The only caveat I can think of is input
> streams, since userland code can't reset and populate php://input, or
> repoint STDOUT.
>
> On the other hand, as soon as you have any form of concurrency, the two
> models are not interchangeable - it would make no sense for an asynchronous
> callback to read from or write to global state.
>
> And that's what I meant about FrankenPHP's API having poor forward
> compatibility - if you standardise on an API that populates global state,
> you close off any possibility of using that API in a concurrent
> environment. If you instead standardise on callbacks which hold request and
> response information in their own scope, you don't close anything off.
>
> If anything, calling this "forwards compatibility" is overly generous: the
> OP gave Swoole as an example of an existing worker environment, but I can't
> see any way that Swoole could implement an API that communicated request
> and response information via global state.
>
> Regards,
>
> --
> Rowan Tommins
> [IMSoP]
>


This new function is intended for SAPIs. Swoole was given as an example of
worker mode, but it isn't a SAPI. AFAIK, it doesn't use the SAPI
infrastructure provided by PHP.
The scope of my proposal is only to provide a new feature in the SAPI
infrastructure to build worker modes to handle HTTP requests, not to deal
with non-SAPI engines.
That being said, I don't understand what would prevent Swoole from
implementing the proposed API, or even from providing a userland
implementation of the proposed API using Swoole under the hood.
It seems doable to emulate the sequential request handling and to create an
adapter from their custom objects to superglobals and streams.

For WebSockets and WebTransports, the same considerations apply. The SAPI
API will have to be extended to deal with such low-level network layers,
worker mode or not. To me, this is very interesting (and needed) but should
be discussed in another RFC.

As pointed out by Crell, FrankenPHP (and similar theoretical solutions)
starts as many workers as needed. This can be a fixed set of workers, as in
FrankenPHP, or a dynamic number of workers, similar to traditional FPM
workers.

FrankenPHP uses threads to parallelize request handling (to start several
instances of the worker script in parallel). Other techniques could be
used: for instance, in the future, we could use goroutines instead of
threads by adding a new backend to TSRM. Goroutines rely on a mix of system
threads and async I/O, and many goroutines can be handled by a single
system thread:
https://github.com/golang/go/blob/master/src/runtime/HACKING.md#gs-ms-ps

The global state is never reset in the same worker context; it is preserved
across requests, except for superglobals and streams, which are updated
with the data of the request being handled.
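
To make the distinction concrete, here is a toy model in C. It is only a
sketch: struct worker and both functions are hypothetical and do not mirror
PHP's real data structures. The query_string field stands in for superglobals
and streams; the counters stand in for the rest of the global state.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of one worker context. */
struct worker {
    int boot_count;         /* global state: preserved across requests */
    int requests_handled;   /* global state: preserved across requests */
    char query_string[64];  /* per-request data: replaced for each request */
};

/* Runs once, when the worker script starts. */
static void worker_boot(struct worker *w)
{
    assert(w != NULL);
    memset(w, 0, sizeof(*w));
    w->boot_count = 1;
}

/* Runs for every request: only the per-request data is replaced. */
static void worker_begin_request(struct worker *w, const char *query_string)
{
    assert(w != NULL && query_string != NULL);
    snprintf(w->query_string, sizeof(w->query_string), "%s", query_string);
    w->requests_handled++;
}
```

After two requests, boot_count is still 1 and requests_handled is 2, while
query_string only reflects the second request.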

Superglobals are the PHP way to expose CGI-like data. Adding support for
other ways to do it such as proposed by WSGI, and/or new objects and the
like could be interesting, but again this is outside the scope of this
proposal, which is narrow and tries to reuse the existing infrastructure as
much as possible. The proposal is simple enough to support new ways if
introduced
at some point in PHP, and the Symfony Runtime and Laravel Octane libraries
prove that it's possible to implement more advanced data structures
user-land on top of the existing superglobals infrastructure.

Regarding the infinite loop, we could indeed remove it using a few lines of
code. I hesitated to do that initially, but the loop gives more flexibility
by allowing the implementation of many features in user-land (like
restarting the worker after a fixed number of requests, when the memory
reaches a certain level, etc). Without this loop, all these 

Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-29 Thread Kévin Dunglas
On Fri, Dec 29, 2023 at 8:14 PM Rowan Tommins 
wrote:

> - FrankenPHP expects the user to manage the main event loop, repeatedly
> passing the server a function to be called once; it doesn't pass
> anything into or out of the userland handler, instead resetting global
> state to mimic a non-worker environment
> [https://frankenphp.dev/docs/worker/#custom-apps]
>

This isn't accurate. FrankenPHP does manage the event loop (the Go runtime
manages it - through a channel - under the hood).
The frankenphp_handle_request() function pauses the thread until the Go
runtime gives back control to the C thread (when a request is dispatched to
this worker).
It's actually very similar to WSGI.
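
The "pause until a request is dispatched" behavior can be sketched in plain C
with a pipe standing in for the Go channel. This is an analogy under stated
assumptions, not FrankenPHP's code: struct request, dispatch_request() and
wait_for_request() are hypothetical names.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical request token handed from the server side to the worker. */
struct request {
    int id;
};

/* Server side: hand one request to the worker. Returns 0 on success. */
static int dispatch_request(int write_fd, int id)
{
    struct request r = { id };

    return write(write_fd, &r, sizeof(r)) == (ssize_t)sizeof(r) ? 0 : -1;
}

/* Worker side: block in read() until a request is dispatched.
 * Returns 1 with *out filled in, or 0 when no complete request was read
 * (channel closed by the server, or read error): the worker should stop. */
static int wait_for_request(int read_fd, struct request *out)
{
    ssize_t n;

    assert(out != NULL);
    n = read(read_fd, out, sizeof(*out));
    return n == (ssize_t)sizeof(*out);
}
```

The worker thread sleeps inside wait_for_request() and consumes no CPU until
the server side writes to the other end of the pipe.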

As I explained in my previous messages, it's expected that other SAPIs
handle the event loop too (using the primitives provided by the language
they are written in).


> - RoadRunner doesn't use a callback at all, instead providing methods to
> await a request and provide a response; it directly uses PSR-7 and
> PSR-17 objects [https://roadrunner.dev/docs/php-worker/current/en]
> - OpenSwoole manages the main loop itself, and uses lifecycle events to
> interface to userland code; the HTTP 'Request' event is passed custom
> Request and Response objects
> [https://openswoole.com/docs/modules/swoole-http-server-on-request]
>

I already replied to Crell about that. It will be totally possible to expose
more complex HTTP message objects in the future,
but PHP currently lacks such objects. The only things we have are
superglobals (which are more or less similar to CGI variables, as done in
WSGI) and streams. It's why we're using them.
If PHP adds a higher-level API at some point, we'll be able to upgrade this
part as every other part of the PHP code base. But it's an unrelated topic:
having such higher-level representations of HTTP messages would be
beneficial both in "normal" and in "worker" mode.


> it would be adapted for an async PHP environment, or with WebSockets,
> for instance.
>

I'm not sure what you mean by "async PHP environment".
WebSockets and WebTransport are a different kind of beast, they are much
lower level than HTTP and will require a different API anyway (and probably
a lot of other adaptations in core) to be supported in PHP.
In Go, for instance, the WebSocket and WebTransport APIs aren't the same as
the HTTP API.

Best regards,


Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-25 Thread Kévin Dunglas
On Mon, Dec 25, 2023 at 7:56 PM Jordan LeDoux 
wrote:

>
>
> On Mon, Dec 25, 2023 at 8:19 AM Kévin Dunglas  wrote:
>
>>
>> On Sun, Dec 24, 2023 at 10:44 PM Jordan LeDoux 
>> wrote:
>>
>>>
>>>
>>> On Sat, Dec 23, 2023 at 12:34 PM Kévin Dunglas  wrote:
>>>
>>>> Hello and Merry Christmas!
>>>>
>>>> One of the main features of FrankenPHP is its worker mode, which lets
>>>> you
>>>> keep a PHP application in memory to handle multiple HTTP requests.
>>>>
>>>> Worker modes are becoming increasingly popular in the PHP world. Symfony
>>>> (Runtime Component), Laravel (Octane), and many projects based on these
>>>> frameworks (API Platform, Sulu...) now support a worker mode.
>>>>
>>>> In addition to FrankenPHP, projects such as RoadRunner and Swoole
>>>> provide
>>>> engines supporting worker modes.
>>>>
>>>> According to benchmarks, worker modes can improve the performance of PHP
>>>> applications by up to 15 times.
>>>> In addition to FrankenPHP, which is basically a SAPI for Go's integrated
>>>> web server, a new generation of SAPIs is currently under development.
>>>> Several SAPIs written in Rust (including one by the RoadRunner team) are
>>>> currently under development.
>>>>
>>>> These SAPIs, along with existing SAPIs, could benefit from a shared
>>>> infrastructure to build worker modes.
>>>>
>>>>
>>>>
>>>> The FrankenPHP code is already written and should be easy to move into PHP
>>>> itself, to enable other SAPIs to use it.
>>>>
>>>> In addition to sharing code, maintenance, performance optimization,
>>>> etc.,
>>>> the existence of a common infrastructure would standardize the way
>>>> worker
>>>> scripts are created and provide a high-level PHP API for writing worker
>>>> scripts that work with all SAPIs that rely on this new feature.
>>>>
>>>> SAPIs will still have to handle fetching requests from the web server
>>>> and
>>>> pausing the worker to wait for new requests (in FrankenPHP, we use
>>>> goroutines for this; in Rust or C, other primitives will have to be
>>>> used),
>>>> but almost everything else could be shared.
>>>>
>>>> For reference, here's the FrankenPHP code I propose to integrate into
>>>> libphp:
>>>> https://github.com/dunglas/frankenphp/blob/main/frankenphp.c#L245
>>>>
>>>> The public API is documented here:
>>>> https://frankenphp.dev/docs/worker/#custom-apps
>>>>
>>>> I'd like to hear what the community thinks about this. Would you be
>>>> interested in this functionality in PHP? Should I work on an RFC?
>>>>
>>>> If there's interest, I can work on a patch.
>>>>
>>>> Cheers,
>>>> --
>>>> Kévin Dunglas
>>>>
>>>
>>> Much like Larry, I'm curious what sort of scope you imagine for this.
>>> Are you imagining something that is geared specifically towards HTTP
>>> requests, or would this be a more generic "PHP Application Worker" that
>>> might be spawned to handle other types of applications? Could we have a
>>> worker listen to a specific port and respond to or handle all requests on
>>> that port/device?
>>>
>>> Jordan
>>>
>>
>> Hi Jordan,
>>
>> Yes, the scope I imagine is geared specifically towards HTTP requests.
>> Something more generic than common primitives for SAPIs and a shared public
>> API to handle HTTP requests with a long-running PHP worker script will be
>> hard to do outside of SAPIs because they depend on a lot of external
>> concerns such as the programming language the SAPI is using.
>>
>
> So you want to introduce a SAPI that doesn't work with any of the existing
> HTTP solutions people use that only supports HTTP requests? Or am I
> misunderstanding something?
>
> This sounds a bit like you want to merge in a tool that is designed for
> your personal product directly into core. FrankenPHP may be incredible and
> awesome, but the world runs on Apache and Nginx for HTTP requests.
>
> Jordan
>

As explained in the initial message and in my reply to Jakub, the main
targets are emerging SAPIs. We have no interest (quite the contrary) in
moving this code from FrankenPHP to PHP core (harder maintenance, slower
iterations as more collaboration will be involved...), but I do think that
having a "standard" and shared infrastructure and API for worker modes
between new generation SAPIs will be beneficial to the community as a whole
(no need - as at present - to write a different worker script for each
engine having a worker mode, sharing of optimizations, security patches
etc...).

We're talking roughly about a C function of a few dozen lines, not
something very big.


Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-25 Thread Kévin Dunglas
On Mon, Dec 25, 2023 at 6:30 PM Jakub Zelenka  wrote:

>
>
> On Mon, Dec 25, 2023 at 12:34 PM Kévin Dunglas  wrote:
>
>> On Sun, Dec 24, 2023 at 4:21 PM Larry Garfield 
>> wrote:
>>
>> > In practice, I want to understand the implications for user-space code.
>> > Does this mean FPM could be configured in a way to execute a file like that
>> > shown in the docs page above?  Or would it only work with third party SAPIs
>> > like FrankenPHP?
>>
>>
>> In theory, PHP-FPM and the Apache module could - like all other SAPIs - be
>> enhanced to add a worker mode operating as described in the FrankenPHP doc
>> thanks to these new primitives.
>>
>
> I have been thinking about something similar for FPM and if you had some
> sort of pool manager process, you could maybe do some sort of initial
> execution but then it gets really tricky especially with sharing resources
> and managing connections. I think it would be a big can of worms so I don't
> think this is going to happen anytime soon. I could imagine that there will
> be similar issues for Apache prefork which is likely the most used MPM for
> legacy apps. Effectively it means that this function won't be working on
> most installations as two of the likely most used SAPI's won't support it.
> I think it should be pretty clear from the beginning.
>
>
>> However, I suggest doing this as a second step, because as described in my
>> first post, it will still be the responsibility of each SAPI to manage
>> long-running processes and communication with them. This is simple to do
>> with Go's goroutines and Rust's asynchronous runtimes such as Tokio; it's
>> definitely more difficult in cross-platform C. I suggest starting by adding
>> the primitives to libphp, then we'll see how to exploit them (and whether
>> it's worthwhile) in the built-in SAPIs.
>>
>
> The problem with this is that we would add some code that won't be used by
> any of the built-in SAPIs, which means that we won't be able to have
> automated tests for this. So the minimum should be to have at least one
> core SAPI supporting this new functionality. I wouldn't mind if it's just a
> SAPI for testing purpose which might be actually useful for testing embed
> SAPI code. I think that should be a requirement for accepting a PR
> introducing this.
>
> It would be also good to put together some base design PR for this as
> currently SAPI common functions are implemented separately in each SAPI
> (e.g. apache_request_headers). From the linked functionality, it is not
> a big amount of code and seems somehow specific to FrankenPHP so why
> couldn't each SAPI just implement this function separately? I know that
> this is not ideal but it's what is already used for apache_request_headers.
> I think otherwise you would need some hooking mechanism that should have
> some default (which would probably just throw exception) because it is not
> going to be implemented by all SAPI's. I think it would be really good if
> you could provide more details about planned implementation for this.
>
>
>> I personally have less interest in working on FPM/CGI/mod_php as the other
>> possibilities offered by modern SAPIs like FrankenPHP are more important
>> (better deployment experience as you have a single static binary or Docker
>> image, Early Hints support, high-quality native HTTP/3 server etc)
>>
>>
> Except that those are all threaded SAPIs so they offer less separation and
> protection against application crashes in addition to the fact that thread
> management in PHP still has got its own issues. There are certainly some
> advantages especially for thin services but if you have huge monolith
> codebase like some big CMS and other projects, then I would probably stick
> with process separation model.
>
> Cheers
>
> Jakub
>

Sure, the main targets are new SAPIs like FrankenPHP and the one in Rust
developed by the RoadRunner team. I thought it was clear in my previous
messages but I'll be glad to make it bold in the RFC.
Automated tests (likely through a test SAPI) will definitely be needed.
Throwing if the current SAPI doesn't (yet) support the new userland
function looks sensible.
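
One possible shape for such a hook with a default, sketched in C. Everything
here is hypothetical (struct toy_sapi, sapi_handle_request(), the
WORKER_UNSUPPORTED code); php-src's real SAPI structure differs, and the
error return stands in for the exception the userland function would throw.

```c
#include <assert.h>
#include <stddef.h>

#define WORKER_UNSUPPORTED (-1)

/* Hypothetical, simplified SAPI descriptor. */
struct toy_sapi {
    const char *name;
    /* Optional hook: NULL when the SAPI has no worker mode. */
    int (*handle_request)(void *ctx);
};

/* Shared entry point: call the SAPI's hook, or report that worker mode is
 * not supported (where the userland function would throw). */
static int sapi_handle_request(const struct toy_sapi *sapi, void *ctx)
{
    assert(sapi != NULL);
    if (sapi->handle_request == NULL) {
        return WORKER_UNSUPPORTED;
    }
    return sapi->handle_request(ctx);
}

/* Example hook for a worker-capable SAPI: counts handled requests and
 * returns the new count. */
static int count_request(void *ctx)
{
    int *counter = ctx;

    return ++*counter;
}
```

Existing SAPIs would leave the hook unset and get the default behavior
without any change on their side.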

Couldn't this shared code be put in "main", as it could (theoretically, I
agree that it will be hard to do for existing core SAPIs) be used by all
SAPIs?


Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-25 Thread Kévin Dunglas
> Forgive my ignorance, but why no connection?  You mean the
> pre-worker-start part needs to avoid an SQL connection?  Why is that?  That
> would be something that needs to be super-well documented, and possibly
> some guards in place to prevent it, if there's no good way around it.
> (This is the sort of detail I'm thinking of, where I just don't know the
> implications but want to think through them as much as possible in advance,
> so that it can be "safe by design.")
>

Sorry, I made a typo. I mean "libraries must ensure that the connection
**is** active" (if the connection timeout has been reached, the library
must reconnect).
Your worker script will be long-running code, as in Java, Go, etc. So if it
depends on external services, it must check that the connection is still
active, and reconnect if necessary.
This is the default in most languages, but not in PHP (yet).
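
A minimal sketch of that check, in C for illustration. struct connection and
ensure_connected() are hypothetical and not taken from any particular
library; the reconnect itself is simulated by flipping a flag.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical connection handle kept alive across requests. */
struct connection {
    int connected;        /* 1 while the link is believed to be up */
    time_t last_used;     /* time of the last successful use */
    time_t idle_timeout;  /* the server drops the link after this idle time */
    int reconnects;       /* number of reconnections performed */
};

/* Call before each use: reconnect if the link is down or has probably
 * timed out while the worker was idle. */
static void ensure_connected(struct connection *c, time_t now)
{
    assert(c != NULL);
    if (!c->connected || now - c->last_used >= c->idle_timeout) {
        /* A real library would reopen the socket here. */
        c->connected = 1;
        c->reconnects++;
    }
    c->last_used = now;
}
```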

> Do you have an intent or expectation of a worker-style SAPI being shipped
> with PHP itself, or for that to remain the domain of third parties?
>

As I tried to explain in my previous message, this could be nice, and
possible, but I don't plan to do it myself for now :)


> I mean more what implications would there be on how user-space code is
> written to be worker-SAPI-friendly.  (The SQL connection comment above, for
> example.)  I have not worked with any of the worker-ish tools so far
> myself, so other than "you'll need an alternate index.php for that", I
> don't have a good sense of what else I'd want to do differently to play
> nice with Franken and Friends.
>

As far as I know, there are no implications other than memory (and other
resource) leaks (https://laravel.com/docs/10.x/octane#managing-memory-leaks)
and timeout handling.


> The idea of combining fiber-based code with supported worker-mode runners
> sounds like a ridiculously cool future for PHP, but I don't know how windy
> that path is. :-)
>

That already works if you use FrankenPHP! Joe also experimented
successfully using the parallel extension instead of Fibers:
https://twitter.com/krakjoe/status/1587234661696245760


Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-25 Thread Kévin Dunglas
On Sun, Dec 24, 2023 at 10:44 PM Jordan LeDoux 
wrote:

>
>
> On Sat, Dec 23, 2023 at 12:34 PM Kévin Dunglas  wrote:
>
>> Hello and Merry Christmas!
>>
>> One of the main features of FrankenPHP is its worker mode, which lets you
>> keep a PHP application in memory to handle multiple HTTP requests.
>>
>> Worker modes are becoming increasingly popular in the PHP world. Symfony
>> (Runtime Component), Laravel (Octane), and many projects based on these
>> frameworks (API Platform, Sulu...) now support a worker mode.
>>
>> In addition to FrankenPHP, projects such as RoadRunner and Swoole provide
>> engines supporting worker modes.
>>
>> According to benchmarks, worker modes can improve the performance of PHP
>> applications by up to 15 times.
>> In addition to FrankenPHP, which is basically a SAPI for Go's integrated
>> web server, a new generation of SAPIs is currently under development.
>> Several SAPIs written in Rust (including one by the RoadRunner team) are
>> currently under development.
>>
>> These SAPIs, along with existing SAPIs, could benefit from a shared
>> infrastructure to build worker modes.
>>
>>
>>
>> The FrankenPHP code is already written and should be easy to move into PHP
>> itself, to enable other SAPIs to use it.
>>
>> In addition to sharing code, maintenance, performance optimization, etc.,
>> the existence of a common infrastructure would standardize the way worker
>> scripts are created and provide a high-level PHP API for writing worker
>> scripts that work with all SAPIs that rely on this new feature.
>>
>> SAPIs will still have to handle fetching requests from the web server and
>> pausing the worker to wait for new requests (in FrankenPHP, we use
>> goroutines for this; in Rust or C, other primitives will have to be used),
>> but almost everything else could be shared.
>>
>> For reference, here's the FrankenPHP code I propose to integrate into
>> libphp: https://github.com/dunglas/frankenphp/blob/main/frankenphp.c#L245
>>
>> The public API is documented here:
>> https://frankenphp.dev/docs/worker/#custom-apps
>>
>> I'd like to hear what the community thinks about this. Would you be
>> interested in this functionality in PHP? Should I work on an RFC?
>>
>> If there's interest, I can work on a patch.
>>
>> Cheers,
>> --
>> Kévin Dunglas
>>
>
> Much like Larry, I'm curious what sort of scope you imagine for this. Are
> you imagining something that is geared specifically towards HTTP requests,
> or would this be a more generic "PHP Application Worker" that might be
> spawned to handle other types of applications? Could we have a worker
> listen to a specific port and respond to or handle all requests on that
> port/device?
>
> Jordan
>

Hi Jordan,

Yes, the scope I imagine is geared specifically towards HTTP requests.
Something more generic than common primitives for SAPIs and a shared public
API to handle HTTP requests with a long-running PHP worker script will be
hard to do outside of SAPIs because they depend on a lot of external
concerns such as the programming language the SAPI is using.


Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-25 Thread Kévin Dunglas
On Sun, Dec 24, 2023 at 4:21 PM Larry Garfield 
wrote:

> In practice, I want to understand the implications for user-space code.
> Does this mean FPM could be configured in a way to execute a file like that
> shown in the docs page above?  Or would it only work with third party SAPIs
> like FrankenPHP?


In theory, PHP-FPM and the Apache module could - like all other SAPIs - be
enhanced to add a worker mode operating as described in the FrankenPHP doc
thanks to these new primitives.

However, I suggest doing this as a second step, because as described in my
first post, it will still be the responsibility of each SAPI to manage
long-running processes and communication with them. This is simple to do
with Go's goroutines and Rust's asynchronous runtimes such as Tokio; it's
definitely more difficult in cross-platform C. I suggest starting by adding
the primitives to libphp, then we'll see how to exploit them (and whether
it's worthwhile) in the built-in SAPIs.
I personally have less interest in working on FPM/CGI/mod_php as the other
possibilities offered by modern SAPIs like FrankenPHP are more important
(better deployment experience as you have a single static binary or Docker
image, Early Hints support, high-quality native HTTP/3 server etc), but I'd
be happy to help if anyone wants to update these SAPIs.

> I assume the handler function would be differently named.


I suggest naming the function handle_request() or something similar and
using the same name for all SAPIs, so the same worker script will work
everywhere. I'll update FrankenPHP to use the "standard" name.


> Is passing in super-globals the right/best way to handle each request, or
> would it be sensible to have some other abstraction there?  (Whether a
> formal request object a la PSR-7 or something else.)


Passing super-globals is at the same time the most interoperable solution
(it allows using almost all existing PHP libraries in worker mode without
any change to them), and also allows reuse of the existing C code.
Transforming super-globals into HttpFoundation, PSR-7, or other objects is
straightforward and can entirely be done userland (it's already what the
Symfony Runtime Component and Laravel Octane do), so there is no need to
"bloat" the C code.

Having more high-level data structures to manipulate HTTP messages similar
to HttpFoundation or PSR-7 in the language could be nice (and is in my
opinion needed), but is a separate topic.
If PHP adds a new abstraction for that at some point, it will be easy to
add support for them both in standard and worker mode.


> To what extent would user-space code run this way have to think about
> concurrency, shared memory, persistent SQL connections, etc?  Does it have
> any implications for fiber-using async code?
>

Regarding concurrency, it doesn't change much (it's similar to existing
SAPIs). Regarding memory and SQL connections, extra care is required. Memory
leaks (and other kinds of leaks) should be avoided (or workers should
restart from time to time, which is obviously a poorer solution). Libraries
maintaining SQL connections such as Doctrine or Eloquent must ensure that
the connection isn't active.
The good news is that thanks to RoadRunner, Swoole, Laravel Octane, Symfony
Runtime, etc., most popular libraries are already compatible with
long-running processes, and most issues have been fixed.
Some old apps and libraries will probably never be updatable, but that's
not a big issue because this feature will be entirely opt-in.

Fibers work as expected. There is a small limitation when using them with
Go (that is being tracked in the Go runtime,
https://frankenphp.dev/docs/known-issues/#fibers), but it's not related to
the C code of the worker mode, and this limitation shouldn't exist for
SAPIs not written in Go.


> Depending on the details, this could be like fibers but for 3rd party
> SAPIs (something about 4 people in the world actually care about directly,
> everyone else just uses Revolt, Amp, or React, but mostly it doesn't get
> used), or completely changing the way 90% of the market runs PHP, which
> means frameworks will likely adapt to use that model primarily or
> exclusively (ie, less of a need for a "compile" step as a generated
> container or dispatcher is just held in memory automatically already).  The
> latter sounds exciting to me, but I'm not sure which is your intent, so I
> don't know if I'm going too far with it. :-)
>

My intent is that most SAPIs expose the same (or a very similar,
interoperable) worker mode. I hope that most PHP developers will not
have to deal with these primitives directly, but that it will allow a new
generation of super-fast PHP apps to be created. Most frameworks already
support that but require a lot of boilerplate code to support the different
existing engines. Standardizing will likely increase adoption and will
allow collaboration to make the low-level code that I propose to move into
libphp as fast, stable, and clean as possible.



[PHP-DEV] RFC proposal: worker mode primitives for SAPIs

2023-12-23 Thread Kévin Dunglas
Hello and Merry Christmas!

One of the main features of FrankenPHP is its worker mode, which lets you
keep a PHP application in memory to handle multiple HTTP requests.

Worker modes are becoming increasingly popular in the PHP world. Symfony
(Runtime Component), Laravel (Octane), and many projects based on these
frameworks (API Platform, Sulu...) now support a worker mode.

In addition to FrankenPHP, projects such as RoadRunner and Swoole provide
engines supporting worker modes.

According to benchmarks, worker modes can improve the performance of PHP
applications by up to 15 times.
In addition to FrankenPHP, which is basically a SAPI for Go's integrated
web server, a new generation of SAPIs is emerging: several SAPIs written
in Rust (including one by the RoadRunner team) are currently under
development.

These SAPIs, along with existing SAPIs, could benefit from a shared
infrastructure to build worker modes.



The FrankenPHP code already exists and should be easy to move into PHP
itself, so that other SAPIs can use it.

In addition to sharing code, maintenance, performance optimization, etc.,
the existence of a common infrastructure would standardize the way worker
scripts are created and provide a high-level PHP API for writing worker
scripts that work with all SAPIs that rely on this new feature.

SAPIs will still have to handle fetching requests from the web server and
pausing the worker to wait for new requests (in FrankenPHP, we use
goroutines for this; in Rust or C, other primitives will have to be used),
but almost everything else could be shared.

For reference, here's the FrankenPHP code I propose to integrate into
libphp: https://github.com/dunglas/frankenphp/blob/main/frankenphp.c#L245

The public API is documented here:
https://frankenphp.dev/docs/worker/#custom-apps
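
To make the shape of such a worker script concrete, here is a minimal
sketch modelled on the API documented at the link above.
frankenphp_handle_request() is the function FrankenPHP exposes today;
boot_app() and the MAX_REQUESTS cap are hypothetical names made up for this
example, and the fallback branch only exists so that the file also runs
under a classic SAPI.

```php
<?php
// Worker script sketch. boot_app() and MAX_REQUESTS are illustrative
// assumptions, not part of FrankenPHP or of the proposed libphp code.

const MAX_REQUESTS = 500; // restart the worker periodically to contain leaks

// Expensive bootstrap: runs once, and its result stays in memory.
function boot_app(): callable
{
    $bootedAt = microtime(true); // stands in for building a container, routes...

    return static function (array $server) use ($bootedAt): string {
        return sprintf("Handled %s (booted at %.3f)\n", $server['REQUEST_URI'] ?? '/', $bootedAt);
    };
}

$app = boot_app();

// Called once per request; under FrankenPHP the superglobals are reset
// before each call.
$handler = static function () use ($app): void {
    echo $app($_SERVER);
};

if (function_exists('frankenphp_handle_request')) {
    // Worker mode: block until a request arrives, handle it, repeat.
    for ($handled = 0; $handled < MAX_REQUESTS; ++$handled) {
        $keepRunning = frankenphp_handle_request($handler);
        gc_collect_cycles(); // cheap hygiene between two requests
        if (!$keepRunning) {
            break;
        }
    }
} else {
    // No worker-capable SAPI: behave like a regular one-shot script.
    $handler();
}
```

Only the branch that calls the engine-specific function would differ from
one engine to another, which is the part a shared primitive could
standardize.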

I'd like to hear what the community thinks about this. Would you be
interested in this functionality in PHP? Should I work on an RFC?

If there's interest, I can work on a patch.

Cheers,
-- 
Kévin Dunglas


Re: [PHP-DEV] Set register_argc_argv to Off by default

2023-11-07 Thread Kévin Dunglas
This change seems reasonable to me: safer, with little chance of breaking
things, and easy to reverse for the end user by changing a single parameter.


[PHP-DEV] Bad interactions between Fibers and GoRoutines (and/or cgo)

2023-08-18 Thread Kévin Dunglas
Hi there,

We are experiencing strange problems with Fibers when running PHP with
FrankenPHP.

Fibers sometimes interact badly with the Go runtime on Linux x86 or amd64
(especially in Docker containers), which leads to crashes. We've tried many
things: compiling with --disable-fiber-asm, compiling with -fsplit-stack,
and increasing the system stack size limit, but crashes still occur.
This looks related to how Go and Fibers manipulate the stack.

Here is a detailed bug report: https://github.com/golang/go/issues/62130
And the reproducer: https://github.com/dunglas/frankenphp/pull/171

Does anyone have any idea what's going on?

Best regards,
-- 
Kévin Dunglas


[PHP-DEV] Proposal to incrementally improve timeout and signal handling

2022-10-20 Thread Kévin Dunglas
Hello Internals,

PHP suffers from several issues related to timeout and signal handling,
especially when built with ZTS enabled.

1. The current implementation of timeouts on UNIX builds seems
"fundamentally incompatible with ZTS" (
https://bugs.php.net/bug.php?id=79464#1589205685) and more anecdotally
conflicts with some Go features (
https://github.com/golang/go/issues/56260#issuecomment-1281040802)
2. "Zend Signals" causes segmentation faults and other problems in
multi-threaded environments (
https://github.com/php/php-src/issues/9649#issuecomment-1264330874,
https://github.com/php/php-src/pull/5591#issuecomment-650064098), and seems
useless anyway since PHP 7.1 (
https://github.com/php/php-src/pull/5591#issuecomment-645428002)

In 2020, Alex Dowad started a major refactoring to improve these parts (
https://github.com/php/php-src/pull/5570,
https://github.com/php/php-src/pull/5591,
https://github.com/php/php-src/pull/5710), but he stopped working on it.

Instead of doing a major one-time refactoring like that, I propose moving
forward little by little to limit the risks and the potential backward
compatibility breaks.

Here is the plan:

1. Switch to timer_create() for timeouts, but only on Linux (it's not
supported on other platforms yet), when ZTS is enabled and Zend Signals is
disabled. As the feature is currently entirely broken when ZTS is on, this
can be considered a bug fix and cannot be worse than it currently is.
1bis. Can be done independently of 1., and even in parallel, optional as
long as we keep the --disable-zend-signals flag: Remove Zend Signals
entirely, because even if it can be partially fixed (I proposed a patch
fixing segfaults: https://github.com/php/php-src/pull/9766), it seems now
useless and causes unfixable issues with some signals such as SIGINT (
https://github.com/php/php-src/issues/9649#issuecomment-1265811930)
2. Switch to Grand Central Dispatch on macOS and FreeBSD when ZTS is
enabled and Zend Signals is disabled (if not removed at this point), which
provides a feature similar to timer_create() for these platforms.
3. Probably in a future major version, optional: switch to
timer_create()/GCD even for non-ZTS builds to unify and simplify the
code.

What do you think about this plan? Apart from the technical aspects, what's
the best way forward? Submit patches? Propose an RFC? Do both? (pardon my
ignorance of internals processes).

Thank you,
-- 
Kévin Dunglas


[PHP-DEV] Set SA_ONSTACK in zend_sigaction

2022-09-22 Thread Kévin Dunglas
Hi, internals!

It's been a while.

I'm currently working on a new SAPI for web servers written in Go.
Many virtual machines, including Go (
https://pkg.go.dev/os/signal#hdr-Go_programs_that_use_cgo_or_SWIG), depend
on signal handlers being installed with SA_ONSTACK (
https://man7.org/linux/man-pages/man2/sigaltstack.2.html). This flag makes
the handler run on the alternate signal stack that a thread has defined
with sigaltstack(). Many argue that SA_ONSTACK should be the default, but
that's not the case (yet).

Python merged a patch setting SA_ONSTACK in 2021 (Python 3.10+) for the
same reasons (https://bugs.python.org/issue43390 /
https://github.com/python/cpython/commit/02ac6f41e5569ec28d625bb005155903f64cc9ee),
with no issues.

I opened a Pull Request to set this flag by default and tested it
successfully with my Go SAPI: https://github.com/php/php-src/pull/9597

As this is technically at the limit between a new feature and a bug fix
(having the ability to call Go/C++ VM code from PHP and embed PHP in such
programs), should I open an RFC? Also, if merging my patch is considered,
which branch should I target?

Cheers,
-- 
Kévin Dunglas


[PHP-DEV] Add support for ::class to constant()

2021-03-09 Thread Kévin Dunglas
Hi folks,

Currently, it's not possible to use the ::class special constant with the
constant() function. This doesn't work:

var_dump(
  constant('\DateTime::class')
);

For instance, Twig's constant() helper internally uses this PHP function,
consequently the following Twig template doesn't work:

`myObject` contains a random object, retrieve its class:
{{ constant('class', myObject) }}

I wrote a patch adding support for ::class:
https://github.com/php/php-src/pull/6763
As this probably qualifies as a new feature, should I write an RFC too?
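
In the meantime, the behaviour can be approximated in userland. The sketch
below is not the patch itself: constant_or_class() is a hypothetical helper
name, and it simply strips the ::class suffix before delegating everything
else to constant().

```php
<?php
// Userland approximation; constant_or_class() is a made-up name for this
// example and is not part of PHP or of the linked pull request.
function constant_or_class(string $name)
{
    $suffix = '::class';
    $cut = strlen($name) - strlen($suffix);

    // The "class" keyword is case-insensitive, so compare without case.
    if ($cut > 0 && strcasecmp(substr($name, $cut), $suffix) === 0) {
        // Like Foo::class on a literal name, only drop the leading "\";
        // the class is neither autoloaded nor checked for existence.
        return ltrim(substr($name, 0, $cut), '\\');
    }

    return constant($name); // regular constants and class constants
}

var_dump(constant_or_class('\DateTime::class')); // string(8) "DateTime"
var_dump(constant_or_class('DateTime::ATOM') === DateTime::ATOM); // bool(true)
```

For the Twig case, where an object is passed instead of a name,
get_class($object) gives the equivalent answer.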

Cheers,


[PHP-DEV] Re: hash_equals: leak less information about length

2015-01-06 Thread Kévin Dunglas
Hello internals,

I submitted this PR a long time ago:
https://github.com/php/php-src/pull/792

I still think it's a good idea to mitigate the length leak (rather than
returning immediately if strings are not of the same length) while
advertising in docs that the length will leak in any case.

The php.net doc has been fixed, but, for instance, this is not the case for
the Symfony doc:
http://symfony.com/doc/current/components/security/secure_tools.html (this
method internally uses hash_equals; I've just submitted a PR to fix this doc,
but I'm sure there are many other misuses in the wild).

To summarize: a theoretical (especially for web apps, more annoying for CLI
apps) and advertised leak is better than a big undocumented leak. Can you
merge this PR?
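
For readers who have not looked at the PR, here is a rough userland
illustration of the idea borrowed from CPython: when the lengths differ,
keep looping over the user-supplied string instead of returning
immediately. It is only a sketch (the function name is hypothetical, and
PHP userland code cannot give real constant-time guarantees); the actual
change lives in C.

```php
<?php
// Illustration only: timing_safe_equals() is a hypothetical name. In real
// code, call hash_equals(); userland loops give no hard timing guarantees.
function timing_safe_equals(string $known, string $user): bool
{
    $sameLength = strlen($known) === strlen($user);

    // On a length mismatch, compare $user with itself so that the loop still
    // runs strlen($user) times, and pre-set the result to "different".
    $left = $sameLength ? $known : $user;
    $diff = $sameLength ? 0 : 1;

    for ($i = 0, $n = strlen($user); $i < $n; ++$i) {
        $diff |= ord($left[$i]) ^ ord($user[$i]);
    }

    return $diff === 0;
}

var_dump(timing_safe_equals('secret', 'secret')); // bool(true)
var_dump(timing_safe_equals('secret', 'secreT')); // bool(false)
var_dump(timing_safe_equals('secret', 'sec'));    // bool(false)
```

The number of loop iterations depends on the length of the user-supplied
string only, not on the length of the known one, which is the "leak less"
part. The length can still leak by other means, hence the need to document
it.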

2014-08-31 12:59 GMT+02:00 Kévin Dunglas dung...@gmail.com:

 Hi,

 I've submitted a PR to make the hash_equals function leak less information
 about compared strings' lengths (benchmark and use cases available in
 comments): https://github.com/php/php-src/pull/792

 Trying to hide length is needed to replace Symfony and Joomla PHP
 implementations by hash_equals (when available).

 The idea:
 - clearly state in the documentation that this function can potentially
 leak lengths
 - try to make it harder for an attacker by using a more robust implementation.

 If there is an agreement to use this kind of implementation, I'll
 rework the PR to use some tricks from the CPython one (
 https://github.com/python/cpython/blob/c7688b44387d116522ff53c0927169db45969f0e/Modules/_operator.c#L175
 - use of volatile and no modulo).

 Best regards,
 --
 Kévin Dunglas

 http://dunglas.fr




-- 
Kévin Dunglas
Freelance consultant and developer

http://dunglas.fr
Tel.: 06 60 91 20 20


[PHP-DEV] Fixed Bug #65576 (Constructor from trait conflicts with inherited constructor)

2014-12-08 Thread Kévin Dunglas
Hi,

I've published a patch for bug #65576:
https://github.com/php/php-src/pull/946
Can you review it and merge it, please?

Best regards.
-- 
Kévin Dunglas

http://dunglas.fr
http://les-tilleuls.coop


Re: [PHP-DEV] Fixed Bug #65576 (Constructor from trait conflicts with inherited constructor)

2014-12-08 Thread Kévin Dunglas
I've just implemented what is described in the linked bug (not my report,
but my team has the same issue): https://bugs.php.net/bug.php?id=65576

The rationale is: it works the same way for all magic methods except for
the constructor and it seems to be a regression introduced in the fix of
another bug (see comments in the bug tracker).

2014-12-08 16:17 GMT+01:00 Levi Morrison le...@php.net:

  I've published a patch for bug #65576 :
  https://github.com/php/php-src/pull/946
  Can you review it and merge it please ?

 Are we sure that's the correct behavior? Can you provide some
 rationale for why it should happen this way?




-- 
Kévin Dunglas


Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL

2014-11-14 Thread Kévin Dunglas
I've just pushed some changes in the PR. FILTER_VALIDATE_DOMAIN now checks
characters validity only if FILTER_FLAG_HOSTNAME is set. I've also rebased
and fixed some issues detailed on GitHub.
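
The rules now implemented can be summarised with a small userland sketch.
This is not the C code from the PR: is_valid_domain() is a hypothetical
helper, the 253/63 limits are the usual DNS ones, and the stricter branch
(letters, digits and inner hyphens only) is my reading of the hostname
rules in RFC 1123.

```php
<?php
// Sketch of the two validation levels; is_valid_domain() is an illustrative
// name, not an ext/filter function.
function is_valid_domain(string $name, bool $hostname = false): bool
{
    // A single trailing dot (fully qualified form) is allowed.
    if (substr($name, -1) === '.') {
        $name = substr($name, 0, -1);
    }

    // Total length: 1 to 253 characters.
    if ($name === '' || strlen($name) > 253) {
        return false;
    }

    foreach (explode('.', $name) as $label) {
        // Label length: 1 to 63 characters.
        if ($label === '' || strlen($label) > 63) {
            return false;
        }
        // Hostname mode: letters, digits and hyphens, no leading/trailing hyphen.
        if ($hostname && !preg_match('/^[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?$/D', $label)) {
            return false;
        }
    }

    return true;
}

var_dump(is_valid_domain('_http._sctp.www.example.com'));       // bool(true)
var_dump(is_valid_domain('_http._sctp.www.example.com', true)); // bool(false)
var_dump(is_valid_domain('www.example.com', true));             // bool(true)
```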

Yasuo, it's not trivial to use this new validator in FILTER_VALIDATE_EMAIL.
Its current implementation uses a big regex that doesn't extract the domain
part. Anyway, a good RFC-compliant email validator cannot be built
with a regex. See https://github.com/egulias/EmailValidator for instance. I
think that's work for another PR. I'll keep the email validator in its
current state for now.

Are you OK with getting the current PR merged?

2014-11-12 19:10 GMT+01:00 Kévin Dunglas dung...@gmail.com:

 Hi Yasuo,

 I've not changed (and even read) the email validator. I'll take a look at
 it.

 2014-11-12 10:41 GMT+01:00 Yasuo Ohgaki yohg...@ohgaki.net:

 Hi Kevin,

 On Wed, Nov 12, 2014 at 4:09 PM, Kévin Dunglas dung...@gmail.com wrote:

 I'll change my PR according to the RFC I've quoted earlier:
 - check for valid characters (excluding underscore) only when
 FILTER_FLAG_HOSTNAME is set
 - allow any character but check lengths by default
 - use FILTER_FLAG_HOSTNAME to validate URLs

 What do you think about that?


 I haven't read diff closely, but it seems ok to me.
 How email domain is checked? I cannot see changes for it from the diff.

 Validating host correctly is difficult, I would like to have your PR.

 Regards,

 --
 Yasuo Ohgaki
 yohg...@ohgaki.net




 --
 Kévin Dunglas

 http://dunglas.fr




-- 
Kévin Dunglas


[PHP-DEV] Re: IDN support in streams

2014-11-14 Thread Kévin Dunglas
Hi,

Can a wiki admin give me RFC creation rights? My wiki username is: dunglas
I'll submit an RFC for IDN support. It will require adding ICU as a core
dependency (previous discussion here
http://marc.info/?l=php-internals&m=141107203812897 and on GitHub).

Thanks!

2014-11-05 8:34 GMT+01:00 Kévin Dunglas dung...@gmail.com:

 Hello,

 I've submitted a PR to add IDN support in PHP streams. The way it's done
 will allow easy IDN domain validation in ext/filter too.

 Can you review this PR please?

 https://github.com/php/php-src/pull/890

 --
 Kévin Dunglas




-- 
Kévin Dunglas


Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL

2014-11-12 Thread Kévin Dunglas
Hi Yasuo,

I've not changed (and even read) the email validator. I'll take a look at
it.

2014-11-12 10:41 GMT+01:00 Yasuo Ohgaki yohg...@ohgaki.net:

 Hi Kevin,

 On Wed, Nov 12, 2014 at 4:09 PM, Kévin Dunglas dung...@gmail.com wrote:

 I'll change my PR according to the RFC I've quoted earlier:
 - check for valid characters (excluding underscore) only when
 FILTER_FLAG_HOSTNAME is set
 - allow any character but check lengths by default
 - use FILTER_FLAG_HOSTNAME to validate URLs

 What do you think about that?


 I haven't read diff closely, but it seems ok to me.
 How email domain is checked? I cannot see changes for it from the diff.

 Validating host correctly is difficult, I would like to have your PR.

 Regards,

 --
 Yasuo Ohgaki
 yohg...@ohgaki.net




-- 
Kévin Dunglas

http://dunglas.fr


Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL

2014-11-11 Thread Kévin Dunglas
Hi,

I'll change my PR according to the RFC I've quoted earlier:
- check for valid characters (excluding underscore) only when
FILTER_FLAG_HOSTNAME is set
- allow any character but check lengths by default
- use FILTER_FLAG_HOSTNAME to validate URLs

What do you think about that?

Best regards,

2014-11-12 7:38 GMT+01:00 Yasuo Ohgaki yohg...@ohgaki.net:

 Hi all,

 On Fri, Nov 7, 2014 at 6:48 AM, Sanford Whiteman figureone...@gmail.com
 wrote:

  FWIW, there *is* a practical in-use (de facto if nothing else)
 convention of using _ in hosts for DKIM:

 _domainkey is actually in all the DKIM RFCs and in the formal STD 76,
 see § 3.6.2.1. Namespace, so it's more than a convention!


 _ is used for service name. Active Directory uses _ a lot, for
 example. e.g. _tcp, _sites, _ldap, etc.

 https://tools.ietf.org/html/rfc2782

 Regards,

 --
 Yasuo Ohgaki
 yohg...@ohgaki.net




-- 
Kévin Dunglas

http://dunglas.fr


Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL

2014-11-06 Thread Kévin Dunglas
FILTER_VALIDATE_DOMAIN checks conformance with DNS RFCs: total length,
label length and allowed characters (_ is allowed in domain names but many
other characters are forbidden, such as ~/+...). I'll add IDN support too
once IDN support for streams is merged.

FILTER_VALIDATE_URL checks conformance with URL RFCs (and not URI, as
discussed on GitHub). RFC conformance of the URL's host part implies DNS RFC
conformance, IPv4 and IPv6 RFC conformance, plus some additional checks (no
underscore allowed in hostnames and IPv6 addresses enclosed in brackets, for
instance). That's why I've added the convenience flag FILTER_FLAG_HOSTNAME.
Btw, there are many use cases for validating that a string is a valid domain
(or a valid hostname): hosting and registrar apps, mail server management
apps and anything else DNS-related.

Maybe we can find a better name for FILTER_VALIDATE_DOMAIN, such as
FILTER_VALIDATE_DOMAIN_NAME
or FILTER_VALIDATE_DNS_DOMAIN (a bit redundant, DNS = Domain Name System), but
please not something related to DNS records, because a valid DNS record can
have the following format:

les-tilleuls.coop. 3600 IN SOA monsite.nnx.com. root.monsite.nnx.com. (
    2014092300 ; serial
    21600      ; refresh (6 hours)
    3600       ; retry (1 hour)
    604800     ; expire (1 week)
    86400      ; minimum (1 day)
)


2014-11-06 13:55 GMT+01:00 Andrey Andreev n...@devilix.net:

 Hi,

 On Thu, Nov 6, 2014 at 8:19 AM, Kévin Dunglas dung...@gmail.com wrote:
  Hi Andrey,
 
  Sorry but I think you're wrong. Domain != hostname. Underscore are
 allowed
  in domains (RFC 2181) but not in hostnames (RFC 1123 and next). To quote
  Wikipedia:
 
  While a hostname may not contain other characters, such as the
 underscore
  character (_), other DNS names may contain the underscore. Systems such
  as DomainKeys and service records use the underscore as a means to assure
  that their special character is not confused with hostnames. For
  example, _http._sctp.www.example.com specifies a service pointer for an
 SCTP
  capable webserver host (www) in the domain example.com.
  http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names
 
  You can also see this StackOverflow answer
  http://stackoverflow.com/a/2183140/1352334

 I agree to an extent, but that is highly contextual.

 Who said that 'domain' === 'DNS record' (which is a very broad term
 anyway)? And IF we assume this, why do you need FILTER_VALIDATE_DOMAIN
 for it if it's only going to check length?

 Cheers,
 Andrey.




-- 
Kévin Dunglas


Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL

2014-11-05 Thread Kévin Dunglas
Hi,

According to the discussion on GitHub, I've made some changes to this PR:
- Added a new FILTER_VALIDATE_DOMAIN filter validating domain names
- Added a FILTER_FLAG_HOSTNAME flag to allow checking hostnames (underscores are
forbidden in hostnames but not in domains)
- Changed FILTER_VALIDATE_URL to use this new validator

Once https://github.com/php/php-src/pull/890 is merged, it will be
easy to add IDN support to this new domain validator.


2014-10-14 13:48 GMT+02:00 Daniel Ribeiro drgom...@gmail.com:

 Nice work man, it looks really good.


 Daniel Ribeiro
 http://danielribeiro.org

 On Tue, Oct 14, 2014 at 3:41 PM, Kévin Dunglas dung...@gmail.com wrote:

 Hi,

 I opened a PR making FILTER_VALIDATE_URL more strict and more compliant
 with standards: https://github.com/php/php-src/pull/826

 Can anyone review (and merge) this patch?

 Thanks!
 --
 Kévin Dunglas





--
Kévin Dunglas


Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL

2014-11-05 Thread Kévin Dunglas
Hi Andrey,

Sorry but I think you're wrong. Domain != hostname. Underscores are allowed
in domains (RFC 2181) but not in hostnames (RFC 1123 and following). To quote
Wikipedia:

While a hostname may not contain other characters, such as the underscore
character (_), other DNS names may contain the underscore. Systems such
as DomainKeys and service records use the underscore as a means to assure
that their special character is not confused with hostnames. For
example, _http._sctp.www.example.com specifies a service pointer for an SCTP
capable webserver host (www) in the domain example.com.
http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names

You can also see this StackOverflow answer
http://stackoverflow.com/a/2183140/1352334

2014-11-06 0:32 GMT+01:00 Andrey Andreev n...@devilix.net:

 Hi,

 On Wed, Nov 5, 2014 at 11:57 PM, Kévin Dunglas dung...@gmail.com wrote:
 
  - Added a new FILTER_VALIDATE_DOMAIN filter validating domain names
  - Added a FILTER_FLAG_HOSTNAME flag to allow checking hostnames (_ are
  forbidden in hostname but not in domains)

 This doesn't make any sense. A domain *is* a hostname and underscores
 are forbidden.

 Cheers,
 Andrey.




-- 
Kévin Dunglas


[PHP-DEV] IDN support in streams

2014-11-04 Thread Kévin Dunglas
Hello,

I've submitted a PR to add IDN support in PHP streams. The way it's done
will allow easy IDN domain validation in ext/filter too.

Can you review this PR please?

https://github.com/php/php-src/pull/890

-- 
Kévin Dunglas


Re: [PHP-DEV] Internationalized Domain Name support in FILTER_VALIDATE_URL

2014-10-15 Thread Kévin Dunglas
Hi Chris,

I've just blogged about IDN support in PHP. The post includes a (tiny)
userland implementation for streams:
http://dunglas.fr/2014/10/internationalized-domain-name-idn-and-php/

What do you think about the following to add native support:
1. As already stated, make ICU a dependency of core
2. Convert the host returned by php_parse_url here
https://github.com/php/php-src/blob/master/ext/standard/http_fopen_wrapper.c#L154
to Punycode with
http://icu-project.org/apiref/icu4c432/uidna_8h.html#a711fa1d2e6dd25d7368f5b3ea2aaedc6

It doesn't look too intrusive and seems relatively easy to implement.
According to the RFC I quote in the blog post, it should work with SSL too.
I can make a PR (or an RFC if needed) with this method if it seems applicable.
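
For comparison, the conversion step already exists in userland through
ext/intl, which is backed by ICU. The sketch below assumes ext/intl is
loaded; host_to_ascii() is a hypothetical helper name, and the native
version would do the equivalent in C inside the HTTP wrapper.

```php
<?php
// host_to_ascii() is an illustrative name. It relies on idn_to_ascii() from
// ext/intl and fails loudly when the extension is not available.
function host_to_ascii(string $host): string
{
    if (!function_exists('idn_to_ascii')) {
        throw new RuntimeException('ext/intl is required for IDN conversion');
    }

    // UTS #46 processing, the default variant in recent PHP versions.
    $ascii = idn_to_ascii($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    if ($ascii === false) {
        throw new InvalidArgumentException("Cannot convert host: $host");
    }

    return $ascii;
}

if (function_exists('idn_to_ascii')) {
    // Host taken from the FILTER_VALIDATE_URL thread.
    echo host_to_ascii('www.académie-française.fr'), "\n";
}
```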

Best regards,


2014-09-24 8:33 GMT+02:00 Pierre Joye pierre@gmail.com:

 On Wed, Sep 24, 2014 at 2:48 AM, Stas Malyshev smalys...@sugarcrm.com
 wrote:
  Hi!
 
  I'll implement optional (and not default) support of IDN in
 filter_var().
 
  Does anyone known if it's better to use libIDN (LGPL) or ICU (custom
  license deviated from the X license) from a license point of view?
 
  ICU is definitely better since we already have a lot of code using ICU
  and AFAIK our current IDN functions (idn_to_*) use ICU. Which means it
  would be advantageous to keep it in the single library - whatever bugs
  there may be, at least the user will be dealing with one set of bugs
  instead of two :)

 Indeed :)

 However I am not sure yet we should do it, or at least not by default.
 It may introduce side effects or BC issues.While IDN is bi-directional
 or could be called many times and returning the same result, we have
 to be careful to do not break things out there, for example someone
 relying on it to process URI/URL.

 Cheers,
 --
 Pierre

 @pierrejoye | http://www.libgd.org




-- 
Kévin Dunglas


[PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL

2014-10-14 Thread Kévin Dunglas
Hi,

I opened a PR making FILTER_VALIDATE_URL more strict and more compliant
with standards: https://github.com/php/php-src/pull/826

Can anyone review (and merge) this patch?

Thanks!
-- 
Kévin Dunglas


Re: [PHP-DEV] #68049 filter_var echo wrong result for a url

2014-09-22 Thread Kévin Dunglas
Some browsers do. Some versions of IE are buggy when the URL includes
underscores:
http://stackoverflow.com/questions/794243/internet-explorer-ignores-cookies-on-some-domains-cannot-read-or-set-cookies

I think that filter_var must follow the RFC by default. Maybe we can add a
flag to allow malformed URLs that are in use in the wild?



2014-09-21 10:42 GMT+02:00 Florian Margaine flor...@margaine.com:

 Hi,

 According to https://bugs.php.net/bug.php?id=51192 , valid URLs cannot
 contain underscores.

 The following bug was reported a couple days ago:
 https://bugs.php.net/bug.php?id=68049

 The thing is, browsers *do* accept the underscore in URLs. Should the
 rfc3986 http://tools.ietf.org/html/rfc3986#section-3.2.2 be respected,
 or
 should PHP be lenient like browsers and accept more?

 Regards,

 *Florian Margaine*




-- 
Kévin Dunglas


Re: [PHP-DEV] Internationalized Domain Name support in FILTER_VALIDATE_URL

2014-09-22 Thread Kévin Dunglas
I'll implement optional (and not default) support of IDN in filter_var().

Does anyone know whether it's better to use libIDN (LGPL) or ICU (custom
license derived from the X license) from a license point of view?

2014-09-19 16:18 GMT+02:00 Chris Wright c...@daverandom.com:

 On 19 September 2014 14:48, Kévin Dunglas dung...@gmail.com wrote:
  Support of IDN in streams is a must have.
  But there is a lot of other use cases for URL with IDN validation. The
 most
  common is probably form validation (test if an user submitted URL has a
  valid format and can be used to create an HTML link...).
 
  I'm ok making IDN validation optional and not used by default until PHP
  natively support IDN in other features such as streams.
  But IDN are used more and more in the wild, and from a user point of
 view it
  is disappointing that a valid URL, working in browsers and even
 displayed by
  Google Search is not considered as a valid URL by a PHP-based website
 using
  filter_var() without a specific flag.
 
  Even some TLD are using non-ASCII characters, exemple: http://旅游气象.中国
 http://xn--zfv73l7xbp87c.xn--fiqs8s
  (popular Chinese weather site).
 
  About the library, I've not preference between libidn and icu. If the
  licence is libidn fit better with the PHP one, libidn is probably the
 better
  choice. Having a PHP specific implementation of STRINGPREP and Punnycode
  sounds not like a good idea (reinventing the wheel, more code to
 maintain).
 
  Chris, is there a chance to have your work on streams merged in PHP 7?

 It's very hacky and PoC at the moment. I've got a bunch of
 time-consuming personal things going on right now, but within the next
 couple of weeks I will try and polish it up into something
 serviceable, maintainable and tested/less likely to explode with
 edge-cases and then I'll put it up for discussion.

 I'm also fine if someone else wants to have a crack in the meantime, I
 can push my work so far to github early next week when I get access to
 the machine.

 I'd certainly like the functionality to be in 7 if it's viable from a
 licensing and dependency PoV - I had been holding off bringing it up
 to see what happened with the more general unicode support discussion
 (which I somewhat lost track of and seems to have died out) as there
 was talk of introducing a hard dependency on ICU-or-similar at one
 point, which would have made this a no-brainer.

  What do you thing about the following planning:
  - 5.7 (if exists): add IDN support in filter disabled by default. Use
 libidn
  if selected to be used for streams too.
  - 7 (if IDN support for streams is completed): validate IDN by default
 (what
  the user expect), add a flag to disable IDN validation. Of course we'll
  update the doc explaining the new behavior.
 
  2014-09-19 12:28 GMT+02:00 Chris Wright c...@daverandom.com:
 
  On 19 September 2014 10:58, Pierre Joye pierre@gmail.com wrote:
   Hi,
  
   On Sep 19, 2014 4:03 PM, Chris Wright c...@daverandom.com wrote:
  
   Kévin
  
   On 18 September 2014 21:26, Kévin Dunglas dung...@gmail.com wrote:
Hello,
   
I'm working on enhancing the FILTER_VALIDATE_URL filter (
https://github.com/php/php-src/pull/826).
The current implementation does not support validation of
internationalized
domain names (i.e: http://www.académie-française.fr/
 http://www.xn--acadmie-franaise-npb1a.fr/
http://www.xn--acadmie-franaise-npb1a.fr/).
   
Support of IDN validation can be easily added using ICU's
uidna_toASCII()
function.
   
Is it acceptable to add a dependency to ICU for ext/filter?
Another option is to add a HAVE_ICU constant in main/php_config.h
 and
to
validate IDN only if ICU is present.
   
What strategy is preferred?
  
   I've done some work around this area previously, and all I will say
   is: be careful with what you do with this from a userland PoV.
  
   PHP does not natively support IDN in stream open routines or SSL
   verification routines. It will never support these things without at
   least one of:
   - a core dependency on ICU, libidn or similar
   - moving streams into an extension so a dependency can be introduced
   there (probably not sanely possible)
   - an in-house NAMEPREP implementation (this is the hard part of IDN,
   punycode itself is pretty trivial to implement once you have a
   canonical set of codepoints)
  
   These things can be implemented with *a lot* of boilerplate in
   userland when you have ext/intl, but it's not pretty. libcurl *can*
   support IDN if it was built against libidn, I'm not sure if this is
   currently the case in common distributions or not. Since one almost
   never just validates a URL string, it's usually a precursor to
   attempting to open it, this could lead to some pretty hefty wtfs.
  
   In short, while I'm generally for ext/filter being able to handle
 IDN,
   I *do not* believe it should do it implicitly, it should require an
   explicit flag, because

Re: [PHP-DEV] #68049 filter_var echo wrong result for a url

2014-09-22 Thread Kévin Dunglas
I've recently proposed a refactoring of FILTER_VALIDATE_URL:
https://github.com/php/php-src/pull/826
I can easily add support for this new flag if everyone agrees.

2014-09-22 9:09 GMT+02:00 Florian Margaine flor...@margaine.com:

 Oh, IE. *sigh*

 Adding a new flag sounds like a good idea indeed,
 `FILTER_VALIDATE_UNCOMPLIANT_URL` sounds good enough?

 I guess it should accept underscores and domain names starting with
 numbers too.

 Regards,

 *Florian Margaine*

 P.S: sorry Kevin for the double mail.
 On 22 Sept. 2014 09:03, Kévin Dunglas dung...@gmail.com wrote:

 Some browsers do. Some versions of IE are buggy when the URL include
 underscores:
 http://stackoverflow.com/questions/794243/internet-explorer-ignores-cookies-on-some-domains-cannot-read-or-set-cookies


 I think that filter_var must follow the RFC by default. Maybe can we add
 a flag to allow malformed URL in use in the wild?



 2014-09-21 10:42 GMT+02:00 Florian Margaine flor...@margaine.com:

 Hi,

 According to https://bugs.php.net/bug.php?id=51192 , valid URLs cannot
 contain underscores.

 The following bug was reported a couple days ago:
 https://bugs.php.net/bug.php?id=68049

 The thing is, browsers *do* accept the underscore in URLs. Should the
 rfc3986 http://tools.ietf.org/html/rfc3986#section-3.2.2 be
 respected, or
 should PHP be lenient like browsers and accept more?

 Regards,

 *Florian Margaine*




 --
 Kévin Dunglas




-- 
Kévin Dunglas


Re: [PHP-DEV] Internationalized Domain Name support in FILTER_VALIDATE_URL

2014-09-19 Thread Kévin Dunglas
Support of IDN in streams is a must have.
But there is a lot of other use cases for URL with IDN validation. The most
common is probably form validation (test if an user submitted URL has a
valid format and can be used to create an HTML link...).

I'm ok making IDN validation optional and not used by default until PHP
natively support IDN in other features such as streams.
But IDN are used more and more in the wild, and from a user point of view
it is disappointing that a valid URL, working in browsers and even
displayed by Google Search is not considered as a valid URL by a PHP-based
website using filter_var() without a specific flag.

Even some TLD are using non-ASCII characters, exemple: http://旅游气象.中国
http://xn--zfv73l7xbp87c.xn--fiqs8s (popular Chinese weather site).

As for the library, I have no preference between libidn and ICU. If
libidn's licence fits better with PHP's, libidn is probably the
better choice. A PHP-specific implementation of Stringprep and
Punycode does not sound like a good idea (reinventing the wheel, more code
to maintain).

Chris, is there a chance to have your work on streams merged in PHP 7?

What do you think about the following plan:
- 5.7 (if it exists): add IDN support to the filter, disabled by default.
Use libidn if it is also selected for streams.
- 7 (if IDN support for streams is completed): validate IDNs by default
(what users expect), and add a flag to disable IDN validation. Of course
we'll update the docs to explain the new behavior.

2014-09-19 12:28 GMT+02:00 Chris Wright c...@daverandom.com:

 On 19 September 2014 10:58, Pierre Joye pierre@gmail.com wrote:
  Hi,
 
  On Sep 19, 2014 4:03 PM, Chris Wright c...@daverandom.com wrote:
 
  Kévin
 
  On 18 September 2014 21:26, Kévin Dunglas dung...@gmail.com wrote:
   Hello,
  
   I'm working on enhancing the FILTER_VALIDATE_URL filter (
   https://github.com/php/php-src/pull/826).
   The current implementation does not support validation of
   internationalized domain names (e.g. http://www.académie-française.fr/,
   ASCII form: http://www.xn--acadmie-franaise-npb1a.fr/).
  
   Support of IDN validation can be easily added using ICU's
   uidna_toASCII()
   function.
  
   Is it acceptable to add a dependency to ICU for ext/filter?
   Another option is to add a HAVE_ICU constant in main/php_config.h and
 to
   validate IDN only if ICU is present.
  
   What strategy is preferred?
 
  I've done some work around this area previously, and all I will say
  is: be careful with what you do with this from a userland PoV.
 
  PHP does not natively support IDN in stream open routines or SSL
  verification routines. It will never support these things without at
  least one of:
  - a core dependency on ICU, libidn or similar
  - moving streams into an extension so a dependency can be introduced
  there (probably not sanely possible)
  - an in-house NAMEPREP implementation (this is the hard part of IDN,
  punycode itself is pretty trivial to implement once you have a
  canonical set of codepoints)
 
  These things can be implemented with *a lot* of boilerplate in
  userland when you have ext/intl, but it's not pretty. libcurl *can*
  support IDN if it was built against libidn, I'm not sure if this is
  currently the case in common distributions or not. Since one almost
  never just validates a URL string, it's usually a precursor to
  attempting to open it, this could lead to some pretty hefty wtfs.
 
  In short, while I'm generally for ext/filter being able to handle IDN,
  I *do not* believe it should do it implicitly, it should require an
  explicit flag, because it will break *a lot* of code if IDN is
  suddenly treated as valid where it previously wasn't.
 
  I am really not sure about that especially the enabling by default part.
 
  The doc is pretty clear about what this filter supports and allowing idn
 may
  break a lot of codes out there.
 
  From an implementation point of view we may not need ICU to support IDN.
  Windows does not use it and there are license friendly decoder
  implementations too.

 If we can agree on adding a core dependency on some IDN support lib,
 I already have an experimental local branch that adds full IDN support
 to streams. It's based on libidn but it would be easy enough to swap
 it out for something else that provides the same functionality.

 In my (biased) opinion, streams are a far more important element of
 IDN support. Filter validation is just polish/a nicety on top.




-- 
Kévin Dunglas

http://dunglas.fr


[PHP-DEV] Internationalized Domain Name support in FILTER_VALIDATE_URL

2014-09-18 Thread Kévin Dunglas
Hello,

I'm working on enhancing the FILTER_VALIDATE_URL filter (
https://github.com/php/php-src/pull/826).
The current implementation does not support validation of internationalized
domain names (e.g. http://www.académie-française.fr/, ASCII form:
http://www.xn--acadmie-franaise-npb1a.fr/).

Support of IDN validation can be easily added using ICU's uidna_toASCII()
function.

Is it acceptable to add a dependency on ICU for ext/filter?
Another option is to add a HAVE_ICU constant in main/php_config.h and to
validate IDNs only if ICU is present.
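
To make the second option concrete, here is a rough sketch of what the
conditional path could look like. It is not the code from the PR:
host_is_ascii() and host_is_acceptable() are hypothetical helpers, the
256-unit buffers are an arbitrary choice, and HAVE_ICU is the macro proposed
above (it does not exist yet). uidna_toASCII() converts a single label, so
the sketch assumes ICU's whole-domain variant uidna_IDNToASCII() instead.

```c
#include <stdbool.h>
#include <stddef.h>

#ifdef HAVE_ICU /* hypothetical macro, as proposed in this message */
#include <unicode/uidna.h>
#include <unicode/ustring.h>
#endif

/* Hypothetical helper: true if every byte of host is 7-bit ASCII. */
static bool host_is_ascii(const char *host, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if ((unsigned char)host[i] > 0x7F) {
            return false;
        }
    }
    return true;
}

/*
 * Hypothetical helper: true if host is plain ASCII, or (only when built
 * with ICU) if it is a UTF-8 IDN that converts cleanly to its ASCII form.
 * The existing ASCII host checks would still have to run afterwards.
 */
static bool host_is_acceptable(const char *host, size_t len)
{
    if (host_is_ascii(host, len)) {
        return true;
    }
#ifdef HAVE_ICU
    UErrorCode status = U_ZERO_ERROR;
    UChar utf16[256]; /* arbitrary illustrative buffer size */
    UChar ascii[256];
    int32_t utf16_len = 0;

    /* ICU's IDNA functions work on UTF-16, so convert from UTF-8 first. */
    u_strFromUTF8(utf16, 256, &utf16_len, host, (int32_t)len, &status);
    if (U_FAILURE(status)) {
        return false;
    }
    uidna_IDNToASCII(utf16, utf16_len, ascii, 256,
                     UIDNA_USE_STD3_RULES, NULL, &status);
    return U_SUCCESS(status);
#else
    /* Without ICU, non-ASCII hosts keep being rejected, as they are today. */
    return false;
#endif
}
```

With this shape, a build without ICU behaves exactly as it does now, and only
ICU-enabled builds start accepting IDNs.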

What strategy is preferred?

--
Kévin Dunglas

http://dunglas.fr


[PHP-DEV] Re: Internationalized Domain Name support in FILTER_VALIDATE_URL

2014-09-18 Thread Kévin Dunglas
Hi,

A flag is a good idea for handling old systems, but the feature must be
enabled by default (at least for PHP 7), with the flag available to disable it.
The IDN RFCs are more than 10 years old, and all major browsers and
registrars support IDN.

On Friday, 19 September 2014, Tjerk Meesters tjerk.meest...@gmail.com
wrote:


 On 19 Sep 2014, at 06:52, Andrea Faulds a...@ajf.me wrote:

 
  On 18 Sep 2014, at 21:26, Kévin Dunglas dung...@gmail.com wrote:
 
  I'm working on enhancing the FILTER_VALIDATE_URL filter (
  https://github.com/php/php-src/pull/826).
  The current implementation does not support validation of
  internationalized domain names (e.g. http://www.académie-française.fr/,
  ASCII form: http://www.xn--acadmie-franaise-npb1a.fr/).
 
  Support of IDN validation can be easily added using ICU's
 uidna_toASCII()
  function.
 
  Is it acceptable to add a dependency to ICU for ext/filter?
  Another option is to add a HAVE_ICU constant in main/php_config.h and to
  validate IDN only if ICU is present.
 
  What strategy is preferred?
 
  Perhaps add a new filter that covers normal URLs and IDN ones? I just
 imagine it might cause problems if suddenly IDNs are accepted, if there is
 a backend which can’t handle them.

 We don’t need a new filter, you can simply add a filter flag for
 FILTER_VALIDATE_URL, e.g. FILTER_FLAG_ALLOW_IDN.

 Of course, the ICU dependency should be optional :)

 
  --
  Andrea Faulds
  http://ajf.me/
 
 
 
 
 
  --
  PHP Internals - PHP Runtime Development Mailing List
  To unsubscribe, visit: http://www.php.net/unsub.php
 



-- 
Kévin Dunglas
Freelance consultant and developer

http://dunglas.fr
Tel.: 06 60 91 20 20


[PHP-DEV] hash_equals: leak less information about length

2014-08-31 Thread Kévin Dunglas
Hi,

I've submitted a PR to make the hash_equals function leak less information
about the lengths of the compared strings (benchmark and use cases are
available in the comments): https://github.com/php/php-src/pull/792

Trying to hide the length is necessary before the Symfony and Joomla PHP
implementations can be replaced by hash_equals (when available).

The idea:
- clearly state in the documentation that this function can potentially
leak lengths
- make it harder for an attacker by using a more robust implementation.

If there is agreement to use this kind of implementation, I'll
rework the PR to use some tricks from the CPython one (
https://github.com/python/cpython/blob/c7688b44387d116522ff53c0927169db45969f0e/Modules/_operator.c#L175
- use of volatile and no modulo).
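
To illustrate the kind of comparison I have in mind, here is a minimal
sketch in the spirit of the CPython approach (volatile, no modulo). It is
not the code from the PR: timing_safe_equals() is a hypothetical name, and
a real implementation would need review for compiler and platform
behavior.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not the code from the PR.
 * Returns 1 if both strings are equal, 0 otherwise.
 * The loop always runs user_len times, so the running time follows the
 * length of the user-supplied string, not the length of the known one.
 */
static int timing_safe_equals(const char *known, size_t known_len,
                              const char *user, size_t user_len)
{
    /* volatile discourages the compiler from turning the loop into an
     * early-exit comparison */
    const volatile unsigned char *left;
    const volatile unsigned char *right = (const volatile unsigned char *)user;
    volatile unsigned char result;
    size_t i;

    if (known_len == user_len) {
        left = (const volatile unsigned char *)known;
        result = 0;
    } else {
        /* Length mismatch: compare user with itself so the same amount of
         * work is done, but force the final result to "not equal". */
        left = right;
        result = 1;
    }

    for (i = 0; i < user_len; i++) {
        /* Accumulate differences without branching on the data. */
        result |= left[i] ^ right[i];
    }

    return result == 0;
}
```

The length check is still a branch, which is why the documentation should
say that lengths can leak; the point is only that the loop no longer
depends on the length of the known string.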

Best regards,
-- 
Kévin Dunglas

http://dunglas.fr