Re: [PHP-DEV] Switching max_execution_time from CPU time to wall-clock time and from SIGPROF to SIGALRM
Hello, I'm in favor of merging Arnaud's patch for macOS while waiting for a better solution like relying on Grand Central Dispatch or another non-signal-based solution (https://github.com/php/php-src/pull/13468), which would allow max_execution_time to work with ZTS builds on mac as well. I'm also in favor of using wall-clock time wherever possible (disclaimer: I'm the original author of this feature for Linux and FreeBSD). Best,
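To make the CPU-time versus wall-clock distinction from the subject line concrete, here is a minimal C sketch. It is not the php-src implementation, and `arm_execution_timer` is a hypothetical name used only for illustration. It relies on the POSIX `setitimer()` call, where `ITIMER_PROF` counts CPU time and raises SIGPROF, while `ITIMER_REAL` counts wall-clock time and raises SIGALRM.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper (not php-src code): arm a one-shot execution timer,
 * or disarm it when seconds == 0.
 * ITIMER_REAL counts wall-clock time and delivers SIGALRM on expiry.
 * ITIMER_PROF counts CPU time consumed by the process (user + system)
 * and delivers SIGPROF on expiry. */
static int arm_execution_timer(long seconds, int use_wall_clock)
{
    struct itimerval t;

    t.it_value.tv_sec = seconds;   /* time until the single expiry */
    t.it_value.tv_usec = 0;
    t.it_interval.tv_sec = 0;      /* no periodic re-arming */
    t.it_interval.tv_usec = 0;

    return setitimer(use_wall_clock ? ITIMER_REAL : ITIMER_PROF, &t, NULL);
}
```

With the wall-clock variant, time spent sleeping or waiting on I/O counts toward the limit; with the CPU-time variant it does not.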
[PHP-DEV] Re: [proposal] max_execution_time to a negative number
According to "man 2 setitimer", the same error should happen on Linux even without zend_max_execution_timer: https://github.com/php/php-src/blob/2079da0158bc91fff4edd85ac66c89b40c4faf3a/Zend/zend_execute_API.c#L1566

A C error will also occur if the value is greater than 999,999,999. We should at least prevent the C error in such cases. I proposed a patch normalizing these values to 0: https://github.com/php/php-src/pull/13942

This is still better than the current situation, and the behavior can still be considered "undefined" until the RFC is voted on.
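As a rough illustration of the normalization described above (a sketch only, not the code of the linked pull request), out-of-range values can be mapped to 0 before they reach the timer call. The function name and the `MAX_TIMER_SECONDS` constant are assumptions made for this example; the upper bound is the one quoted in the message.

```c
/* Upper bound quoted in the message above; the constant name is hypothetical. */
#define MAX_TIMER_SECONDS 999999999L

/* Hypothetical helper: map values the timer call would reject to 0.
 * For max_execution_time, 0 means "no time limit". */
static long normalize_max_execution_time(long seconds)
{
    if (seconds < 0 || seconds > MAX_TIMER_SECONDS) {
        return 0;
    }
    return seconds;
}
```

For example, `normalize_max_execution_time(-5)` and `normalize_max_execution_time(1000000000L)` both yield 0, while 30 is returned unchanged.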
Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
Data classes will be a very useful addition to "API Platform". API Platform is a "resource-oriented" framework that strongly encourages the use of "data-only" classes: we use PHP classes both as a specification language to document the public shape of web APIs (like an OpenAPI specification, but written in PHP instead of JSON or YAML), and as Data Transfer Objects containing the data to be serialized into JSON (read), or the JSON payload deserialized into PHP objects (write). Being able to encourage users to use structs (that's what we already call this type of behavior-less class in our workshops) for these objects will help us a lot. Kévin
Re: [PHP-DEV] php-src docs
I strongly support this initiative. When I started writing a SAPI (even though I already had some experience with the PHP code base), I spent a lot of time reading scattered articles on the subject, many of which were incomplete or outdated. Having a centralized place to search and contribute would make things a lot easier.

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php
[PHP-DEV] Re: RFC proposal: worker mode primitives for SAPIs
On Jan 4, 2024, at 18:21, Joanhey wrote:

> Hi,
>
> I like it as a way to start a discussion, which is necessary.
> But we need to see the big picture.
>
> The CLI SAPI is the poor relation in PHP (contrary to other languages), but that is another discussion that I'll try to open later.
>
> Create a Worker SAPI?
>
> First, a CLI worker can't access the SAPI, so it doesn't get any benefit. Amp, React, Revolt, Workerman, Adapterman, Symfony Runtime... can't access the internal SAPI functions. Each needs to recreate, in userland PHP code, functions that already exist in PHP SAPIs.
> Kudos to Ilija for the RFC1867 RFC. But we need to go further.
> It isn't possible to use header functions :(
> https://github.com/php/php-src/issues/12304
>
> Then we have SAPIs that use PHP embed (really easy to use :)) or something similar.
>
> Here we find 2 ways: forks or threads!
> With forks we can use superglobals; with threads it's impossible, so they need to encapsulate the data in Request/Response objects.
>
> How will we reconcile both situations? Here the discussion starts.
>
> Forks:
> FrankenPHP, RoadRunner, Ngx-php (the fastest PHP runtime)...
> Nginx Unit still uses a shared-nothing approach, but it's really easy to have both.
>
> Threads:
> Swoole, OpenSwoole, Swow... in that situation superglobals are NOT possible.
>
> Some frameworks allow using both (forks or threads) depending on the master event loop that we choose. But they need to force everything into the threads way to have a unified interface.
>
> We are talking about the main loop, because inside it we can use any thread system.
>
> Thanks to all, and to Kevin for starting the discussion.
>
> PS: Currently, any new PHP SAPI needs to be added to php-src to have OPcache enabled. Nginx Unit and others still use the cli-server SAPI to have it. That needs to change, so that any SAPI can use it without registering.
>
> Regards
> Joan Miquel

Thanks for the summary!
For the record, FrankenPHP and NGINX Unit use threads, not forks (and recommend ZTS PHP builds). And as far as I understand, Swoole etc use a reactor pattern and non-blocking IOs, not threads. Also Symfony Runtime isn’t an engine but a library with adapters for FrankenPHP, RoadRunner, Bref etc. The main SAPI using libphp is the embed SAPI, but some non-core SAPIs including FrankenPHP, NGINX Unit and uWSGI use libphp with their own SAPIs. The confusing part is that you need to enable the embed SAPI through the configure options to build libphp, even if the embed SAPI itself isn’t used. Best regards,
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On Sun, Dec 31, 2023 at 2:20 AM Rowan Tommins wrote:

> On 30 December 2023 19:48:39 GMT, Larry Garfield wrote:
> > The Franken-model is closer to how PHP-FPM works today, which means that is easier to port existing code to, especially existing code that has lots of globals or hidden globals. (Eg, Laravel.) That may or may not make it the better model overall, I don't know, but it's the more-similar model.
>
> That's why I said earlier that it provides better backwards compatibility - existing code which directly uses PHP's current global state can more easily be run in a worker which populates that global state.
>
> However, the benefit is marginal, for two reasons. Firstly, because in practice a lot of applications avoid touching the global state outside of some request bootstrapping code anyway. The FrankenPHP example code and Laravel Octane both demonstrate this.
>
> Secondly, because in an environment that handles a single request at a time, the reverse is also possible: if the server passes request information directly to a callback, that callback can populate the superglobals as appropriate. The only caveat I can think of is input streams, since userland code can't reset and populate php://input, or repoint STDOUT.
>
> On the other hand, as soon as you have any form of concurrency, the two models are not interchangeable - it would make no sense for an asynchronous callback to read from or write to global state.
>
> And that's what I meant about FrankenPHP's API having poor forward compatibility - if you standardise on an API that populates global state, you close off any possibility of using that API in a concurrent environment. If you instead standardise on callbacks which hold request and response information in their own scope, you don't close anything off.
>
> If anything, calling this "forwards compatibility" is overly generous: the OP gave Swoole as an example of an existing worker environment, but I can't see any way that Swoole could implement an API that communicated request and response information via global state.
>
> Regards,
>
> --
> Rowan Tommins
> [IMSoP]

This new function is intended for SAPIs. Swoole was given as an example of worker mode, but it isn't a SAPI. AFAIK, it doesn't use the SAPI infrastructure provided by PHP. The scope of my proposal is only to provide a new feature in the SAPI infrastructure to build worker modes to handle HTTP requests, not to deal with non-SAPI engines.

That being said, I don't understand what would prevent Swoole from implementing the proposed API, or even to implement a userland implementation of the proposed API using Swoole under the hood. It seems doable to emulate the sequential request handling and to create an adapter from their custom objects to superglobals and streams.

For WebSockets and WebTransports, the same considerations apply. The SAPI API will have to be extended to deal with such low-level network layers, worker mode or not. To me, this is very interesting (and needed) but should be discussed in another RFC.

As pointed out by Crell, FrankenPHP (and similar theoretical solutions) starts as many workers as needed. This can be a fixed set of workers, as in FrankenPHP, or a dynamic number of workers, similar to traditional FPM workers. FrankenPHP uses threads to parallelize request handling (to start several instances of the worker script in parallel).
Other techniques could be used. For instance, in the future, we could use goroutines instead of threads by adding a new backend to TSRM (goroutines rely on a mix of system threads and async IO, and several goroutines can be handled by a single system thread: https://github.com/golang/go/blob/master/src/runtime/HACKING.md#gs-ms-ps).

The global state is never reset in the same worker context: it is preserved across requests, except for superglobals and streams, which are updated with the data of the request being handled.

Superglobals are the PHP way to expose CGI-like data. Adding support for other ways to do it, such as the one proposed by WSGI, and/or new objects and the like could be interesting, but again this isn't in the scope of this proposal, which is narrow and tries to reuse the existing infrastructure as much as possible. The proposal is simple enough to support new ways if they are introduced at some point in PHP, and the Symfony Runtime and Laravel Octane libraries prove that it's possible to implement more advanced data structures in userland on top of the existing superglobals infrastructure.

Regarding the infinite loop, we could indeed remove it using a few lines of code. I hesitated to do that initially, but the loop gives more flexibility by allowing the implementation of many features in userland (like restarting the worker after a fixed number of requests, when the memory reaches a certain level, etc). Without this loop, all these
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On Fri, Dec 29, 2023 at 8:14 PM Rowan Tommins wrote:

> - FrankenPHP expects the user to manage the main event loop, repeatedly passing the server a function to be called once; it doesn't pass anything into or out of the userland handler, instead resetting global state to mimic a non-worker environment [https://frankenphp.dev/docs/worker/#custom-apps]

This isn't accurate. FrankenPHP does manage the event loop (the Go runtime manages it, through a channel, under the hood). The frankenphp_handle_request() function pauses the thread until the Go runtime gives back control to the C thread (when a request is dispatched to this worker). It's actually very similar to WSGI. As I explained in my previous messages, it's expected that other SAPIs handle the event loop too (using the primitives provided by the language they are written in).

> - RoadRunner doesn't use a callback at all, instead providing methods to await a request and provide a response; it directly uses PSR-7 and PSR-17 objects [https://roadrunner.dev/docs/php-worker/current/en]
> - OpenSwoole manages the main loop itself, and uses lifecycle events to interface to userland code; the HTTP 'Request' event is passed custom Request and Response objects [https://openswoole.com/docs/modules/swoole-http-server-on-request]

I already replied to Crell about that. It will be totally possible to expose more complex HTTP message objects in the future, but PHP currently lacks such objects. The only things we have are superglobals (which are more or less similar to CGI variables, as done in WSGI) and streams. That's why we're using them. If PHP adds a higher-level API at some point, we'll be able to upgrade this part like every other part of the PHP code base. But it's an unrelated topic: having such higher-level representations of HTTP messages would be beneficial both in "normal" and in "worker" mode.

> it would be adapted for an async PHP environment, or with WebSockets, for instance.

I'm not sure what you mean by "async PHP environment". WebSockets and WebTransport are a different kind of beast: they are much lower level than HTTP and will require a different API anyway (and probably a lot of other adaptations in core) to be supported in PHP. In Go, for instance, the WebSocket and WebTransport APIs aren't the same as the HTTP API.

Best regards,
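For SAPIs written in C, the "pause the worker until a request is dispatched" behavior described above could be approximated with a mutex and a condition variable. The following is only a sketch of that idea: it is not how FrankenPHP works (FrankenPHP relies on the Go runtime and a channel), and `worker_slot`, `slot_dispatch`, `slot_wait` and `slot_shutdown` are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox shared by the server side and one worker
 * thread. A request is represented by an opaque string for simplicity. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    const char     *pending;   /* NULL when no request is waiting */
    bool            shutdown;  /* set when the worker must stop */
} worker_slot;

static void slot_init(worker_slot *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->ready, NULL);
    s->pending = NULL;
    s->shutdown = false;
}

/* Server side: hand a request to the worker and wake it up.
 * This sketch overwrites any request that has not been consumed yet. */
static void slot_dispatch(worker_slot *s, const char *request)
{
    pthread_mutex_lock(&s->lock);
    s->pending = request;
    pthread_cond_signal(&s->ready);
    pthread_mutex_unlock(&s->lock);
}

/* Server side: ask the worker to stop waiting for requests. */
static void slot_shutdown(worker_slot *s)
{
    pthread_mutex_lock(&s->lock);
    s->shutdown = true;
    pthread_cond_broadcast(&s->ready);
    pthread_mutex_unlock(&s->lock);
}

/* Worker side: block until a request is dispatched.
 * Returns NULL if shutdown was requested and nothing is pending. */
static const char *slot_wait(worker_slot *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->pending == NULL && !s->shutdown) {
        pthread_cond_wait(&s->ready, &s->lock);
    }
    const char *request = s->pending;
    s->pending = NULL;
    pthread_mutex_unlock(&s->lock);
    return request;
}
```

A userland "handle request" function built on something like this would call `slot_wait()`, populate the superglobals from the returned request, run the callback, and report whether the loop should continue.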
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On Mon, Dec 25, 2023 at 7:56 PM Jordan LeDoux wrote:

> On Mon, Dec 25, 2023 at 8:19 AM Kévin Dunglas wrote:
>
>> Hi Jordan,
>>
>> Yes, the scope I imagine is geared specifically towards HTTP requests. Something more generic than common primitives for SAPIs and a shared public API to handle HTTP requests with a long-running PHP worker script will be hard to do outside of SAPIs because they depend on a lot of external concerns such as the programming language the SAPI is using.
>
> So you want to introduce a SAPI that doesn't work with any of the existing HTTP solutions people use that only supports HTTP requests? Or am I misunderstanding something?
>
> This sounds a bit like you want to merge in a tool that is designed for your personal product directly into core. FrankenPHP may be incredible and awesome, but the world runs on Apache and Nginx for HTTP requests.
>
> Jordan

As explained in the initial message and in my reply to Jakub, the main targets are emerging SAPIs. We have no interest (quite the contrary) in moving this code from FrankenPHP to PHP core (harder maintenance, slower iterations as more collaboration will be involved...), but I do think that having a "standard" and shared infrastructure and API for worker modes between new generation SAPIs will be beneficial to the community as a whole (no need - as at present - to write a different worker script for each engine having a worker mode, sharing of optimizations, security patches etc...).

We're talking roughly about a C function of a few dozen lines, not something very big.
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On Mon, Dec 25, 2023 at 6:30 PM Jakub Zelenka wrote:

> On Mon, Dec 25, 2023 at 12:34 PM Kévin Dunglas wrote:
>
>> On Sun, Dec 24, 2023 at 4:21 PM Larry Garfield wrote:
>>
>> > In practice, I want to understand the implications for user-space code. Does this mean FPM could be configured in a way to execute a file like that shown in the docs page above? Or would it only work with third party SAPIs like FrankenPHP?
>>
>> In theory, PHP-FPM and the Apache module could - like all other SAPIs - be enhanced to add a worker mode operating as described in the FrankenPHP doc thanks to these new primitives.
>
> I have been thinking about something similar for FPM and if you had some sort of pool manager process, you could maybe do some sort of initial execution but then it gets really tricky especially with sharing resources and managing connections. I think it would be a big can of worms so I don't think this is going to happen anytime soon. I could imagine that there will be similar issues for Apache prefork which is likely the most used MPM for legacy apps. Effectively it means that this function won't be working on most installations as two of the likely most used SAPIs won't support it. I think it should be pretty clear from the beginning.
>
>> However, I suggest doing this as a second step, because as described in my first post, it will still be the responsibility of each SAPI to manage long-running processes and communication with them. This is simple to do with Go's goroutines and Rust's asynchronous runtimes such as Tokio, it's definitely more difficult in cross-platform C. I suggest starting by adding the primitives to libphp, then we'll see how to exploit them (and whether it's worthwhile) in the built-in SAPIs.
>
> The problem with this is that we would add some code that won't be used by any of the built-in SAPIs, which means that we won't be able to have automated tests for this. So the minimum should be to have at least one core SAPI supporting this new functionality. I wouldn't mind if it's just a SAPI for testing purposes, which might actually be useful for testing embed SAPI code. I think that should be a requirement for accepting a PR introducing this.
>
> It would also be good to put together some base design PR for this, as currently SAPI common functions are implemented separately in each SAPI (e.g. apache_request_headers). From the linked functionality, it is not a big amount of code and seems somehow specific to FrankenPHP, so why couldn't each SAPI just implement this function separately? I know that this is not ideal but it's what is already used for apache_request_headers. I think otherwise you would need some hooking mechanism that should have some default (which would probably just throw an exception) because it is not going to be implemented by all SAPIs. I think it would be really good if you could provide more details about the planned implementation for this.
>
>> I personally have less interest in working on FPM/CGI/mod_php as the other possibilities offered by modern SAPIs like FrankenPHP are more important (better deployment experience as you have a single static binary or Docker image, Early Hints support, high-quality native HTTP/3 server etc)
>
> Except that those are all threaded SAPIs, so they offer less separation and protection against application crashes, in addition to the fact that thread management in PHP still has its own issues. There are certainly some advantages, especially for thin services, but if you have a huge monolithic codebase like some big CMS and other projects, then I would probably stick with the process separation model.
>
> Cheers
>
> Jakub

Sure, the main targets are new SAPIs like FrankenPHP and the one in Rust developed by the RoadRunner team. I thought it was clear in my previous messages but I'll be glad to make it bold in the RFC.
Automated tests (likely through a test SAPI) will definitely be needed. Throwing if the current SAPI doesn't (yet) support the new userland function looks sensible. Couldn't this shared code be put in "main", as it could (theoretically, I agree that it will be hard to do for existing core SAPIs) be used by all SAPIs?
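One possible shape for the "hook with a default" idea discussed above is sketched below, assuming a simplified SAPI descriptor. This is not the real sapi_module_struct, and every name here (`sapi_descriptor`, `worker_handle_request`, `demo_hook`, the `WORKER_*` constants) is hypothetical. Shared code calls the SAPI's hook when one is registered and otherwise reports that worker mode is unsupported, so the userland function can throw.

```c
#include <stddef.h>

/* Result codes for the shared entry point (hypothetical). */
enum { WORKER_OK = 0, WORKER_UNSUPPORTED = -1 };

/* Hook a SAPI may provide to wait for and handle one request. */
typedef int (*handle_request_hook)(void *request_ctx);

/* Simplified, hypothetical stand-in for a SAPI module descriptor. */
typedef struct {
    const char          *name;
    handle_request_hook  handle_request;  /* NULL: worker mode not supported */
} sapi_descriptor;

/* Shared code: delegate to the SAPI's hook, or report "unsupported" so the
 * caller can raise an exception in userland. */
static int worker_handle_request(const sapi_descriptor *sapi, void *request_ctx)
{
    if (sapi == NULL || sapi->handle_request == NULL) {
        return WORKER_UNSUPPORTED;
    }
    return sapi->handle_request(request_ctx);
}

/* Example hook used for illustration: counts the requests it was given. */
static int demo_hook(void *request_ctx)
{
    int *handled = request_ctx;
    (*handled)++;
    return WORKER_OK;
}
```

A SAPI that registers `demo_hook` gets `WORKER_OK` back; one that leaves the field NULL gets `WORKER_UNSUPPORTED`, which is the case where throwing would apply.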
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
> Forgive my ignorance, but why no connection? You mean the pre-worker-start part needs to avoid an SQL connection? Why is that? That would be something that needs to be super-well documented, and possibly some guards in place to prevent it, if there's no good way around it. (This is the sort of detail I'm thinking of, where I just don't know the implications but want to think through them as much as possible in advance, so that it can be "safe by design.")

Sorry, I made a typo. I mean "libraries must ensure that the connection **is** active" (if the connection timeout has been reached, the library must reconnect). Your worker script will be long-running code, as in Java, Go, etc. So if it depends on external services, it must check that the connection is still active, and reconnect if necessary. This is the default in most languages, but not in PHP (yet).

> Do you have an intent or expectation of a worker-style SAPI being shipped with PHP itself, or for that to remain the domain of third parties?

As I tried to explain in my previous message, this could be nice, and possible, but I don't plan to do it myself for now :)

> I mean more what implications would there be on how user-space code is written to be worker-SAPI-friendly. (The SQL connection comment above, for example.) I have not worked with any of the worker-ish tools so far myself, so other than "you'll need an alternate index.php for that", I don't have a good sense of what else I'd want to do differently to play nice with Franken and Friends.

As far as I know, there are no other implications than memory (and other resource) leaks (https://laravel.com/docs/10.x/octane#managing-memory-leaks) and timeout handling.

> The idea of combining fiber-based code with supported worker-mode runners sounds like a ridiculously cool future for PHP, but I don't know how windy that path is. :-)

That already works if you use FrankenPHP! Joe also experimented successfully using the parallel extension instead of Fibers: https://twitter.com/krakjoe/status/1587234661696245760
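The "check that the connection is still active, and reconnect if necessary" rule mentioned above can be sketched as follows. This is a simplified model under stated assumptions: `db_handle` and both functions are hypothetical, staleness is decided from an idle timer only, and the reconnect is a placeholder. Real database libraries typically ping the server or react to a failed query instead.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical connection handle kept alive across requests by a worker. */
typedef struct {
    bool   connected;
    time_t last_used;     /* time of the last successful use */
    time_t idle_timeout;  /* seconds after which the server drops the link */
    int    reconnects;    /* number of reconnections performed so far */
} db_handle;

/* A connection is considered stale if it is closed or has been idle for at
 * least idle_timeout seconds. */
static bool connection_is_stale(const db_handle *db, time_t now)
{
    return !db->connected || (now - db->last_used) >= db->idle_timeout;
}

/* Call before each use of the connection: reconnect if it is stale, then
 * record the time of use. */
static void ensure_connection(db_handle *db, time_t now)
{
    if (connection_is_stale(db, now)) {
        db->connected = true;  /* placeholder: a real library reopens the socket here */
        db->reconnects++;
    }
    db->last_used = now;
}
```

Calling `ensure_connection()` at the start of each request is one way a long-running worker can avoid using a connection that the server has already closed.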
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On Sun, Dec 24, 2023 at 10:44 PM Jordan LeDoux wrote:

> Much like Larry, I'm curious what sort of scope you imagine for this. Are you imagining something that is geared specifically towards HTTP requests, or would this be a more generic "PHP Application Worker" that might be spawned to handle other types of applications? Could we have a worker listen to a specific port and respond to or handle all requests on that port/device?
>
> Jordan

Hi Jordan,

Yes, the scope I imagine is geared specifically towards HTTP requests. Something more generic than common primitives for SAPIs and a shared public API to handle HTTP requests with a long-running PHP worker script will be hard to do outside of SAPIs because they depend on a lot of external concerns such as the programming language the SAPI is using.
Re: [PHP-DEV] RFC proposal: worker mode primitives for SAPIs
On Sun, Dec 24, 2023 at 4:21 PM Larry Garfield wrote:

> In practice, I want to understand the implications for user-space code. Does this mean FPM could be configured in a way to execute a file like that shown in the docs page above? Or would it only work with third party SAPIs like FrankenPHP?

In theory, PHP-FPM and the Apache module could - like all other SAPIs - be enhanced to add a worker mode operating as described in the FrankenPHP doc thanks to these new primitives.

However, I suggest doing this as a second step, because as described in my first post, it will still be the responsibility of each SAPI to manage long-running processes and communication with them. This is simple to do with Go's goroutines and Rust's asynchronous runtimes such as Tokio, it's definitely more difficult in cross-platform C. I suggest starting by adding the primitives to libphp, then we'll see how to exploit them (and whether it's worthwhile) in the built-in SAPIs.

I personally have less interest in working on FPM/CGI/mod_php as the other possibilities offered by modern SAPIs like FrankenPHP are more important (better deployment experience as you have a single static binary or Docker image, Early Hints support, high-quality native HTTP/3 server etc), but I'd be happy to help if anyone wants to update these SAPIs.

> I assume the handler function would be differently named.

I suggest naming the function handle_request() or something similar and using the same name for all SAPIs, so the same worker script will work everywhere. I'll update FrankenPHP to use the "standard" name.

> Is passing in super-globals the right/best way to handle each request, or would it be sensible to have some other abstraction there? (Whether a formal request object a la PSR-7 or something else.)

Passing super-globals is at the same time the most interoperable solution (it allows using almost all existing PHP libraries in worker mode without any change to them), and also allows reusing the existing C code.

Transforming super-globals into HttpFoundation, PSR-7, or other objects is straightforward and can entirely be done in userland (it's already what the Symfony Runtime Component and Laravel Octane do), so there is no need to "bloat" the C code.

Having more high-level data structures to manipulate HTTP messages similar to HttpFoundation or PSR-7 in the language could be nice (and is in my opinion needed), but is a separate topic. If PHP adds a new abstraction for that at some point, it will be easy to add support for them both in standard and worker mode.

> To what extent would user-space code run this way have to think about concurrency, shared memory, persistent SQL connections, etc? Does it have any implications for fiber-using async code?

Regarding concurrency, it doesn't change much (it's similar to existing SAPIs).

Regarding memory and SQL connections, extra care is required. Memory leaks (and other kinds of leaks) should be avoided (or workers should restart from time to time, which is obviously a poorer solution). Libraries maintaining SQL connections such as Doctrine or Eloquent must ensure that the connection isn't active.

The good news is that thanks to RoadRunner, Swoole, Laravel Octane, Symfony Runtime etc., most popular libraries are already compatible with long-running processes, and most issues have been fixed. Some old apps and libraries will probably never be updatable, but that's not a big issue because this feature will be entirely opt-in.

Fibers work as expected. There is a small limitation when using them with Go (that is being tracked in the Go runtime, https://frankenphp.dev/docs/known-issues/#fibers), but it's not related to the C code of the worker mode, and this limitation shouldn't exist for SAPIs not written in Go.

> Depending on the details, this could be like fibers but for 3rd party SAPIs (something about 4 people in the world actually care about directly, everyone else just uses Revolt, Amp, or React, but mostly it doesn't get used), or completely changing the way 90% of the market runs PHP, which means frameworks will likely adapt to use that model primarily or exclusively (ie, less of a need for a "compile" step as a generated container or dispatcher is just held in memory automatically already). The latter sounds exciting to me, but I'm not sure which is your intent, so I don't know if I'm going too far with it. :-)

My intent is that most SAPIs expose the same (or a very similar interoperable) worker mode. So (I hope) that most PHP developers will not have to deal with these primitives directly, but that it will allow a new generation of super-fast PHP apps to be created. Most frameworks already support that but require a lot of boilerplate code to support the different existing engines. Standardizing will likely increase adoption and will allow collaboration to make the low-level code that I propose to move into libphp as fast, stable, and clean as possible.
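The "workers should restart from time to time" mitigation mentioned above is usually implemented in the worker script itself, as a loop that stops after a fixed number of requests so that the server can start a fresh worker. The sketch below models that loop in C purely for illustration; in practice it would be a few lines of userland PHP, and `run_worker`, `handle_request_fn` and `demo_handle` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Handles one request; returns false when the server is shutting down and
 * no further request will arrive (hypothetical contract). */
typedef bool (*handle_request_fn)(void *state);

/* Worker loop: handle requests until the handler reports shutdown or until
 * max_requests have been served (0 means no limit). Returns the number of
 * requests handled; the caller then exits so a fresh worker can be started,
 * which bounds the effect of memory leaks. */
static size_t run_worker(handle_request_fn handle, void *state, size_t max_requests)
{
    size_t handled = 0;

    while (max_requests == 0 || handled < max_requests) {
        if (!handle(state)) {
            break;
        }
        handled++;
    }
    return handled;
}

/* Example handler for illustration: serves requests while a counter of
 * queued requests is non-zero. */
static bool demo_handle(void *state)
{
    int *queued = state;

    if (*queued == 0) {
        return false;
    }
    (*queued)--;
    return true;
}
```

With 10 queued requests and a limit of 3, `run_worker()` returns after 3 requests and leaves 7 for the next worker instance.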
[PHP-DEV] RFC proposal: worker mode primitives for SAPIs
Hello and Merry Christmas!

One of the main features of FrankenPHP is its worker mode, which lets you keep a PHP application in memory to handle multiple HTTP requests.

Worker modes are becoming increasingly popular in the PHP world. Symfony (Runtime Component), Laravel (Octane), and many projects based on these frameworks (API Platform, Sulu...) now support a worker mode.

In addition to FrankenPHP, projects such as RoadRunner and Swoole provide engines supporting worker modes.

According to benchmarks, worker modes can improve the performance of PHP applications by up to 15 times.

In addition to FrankenPHP, which is basically a SAPI for Go's integrated web server, a new generation of SAPIs is currently under development, including several written in Rust (one of them by the RoadRunner team).

These SAPIs, along with existing SAPIs, could benefit from a shared infrastructure to build worker modes.

The FrankenPHP code is already written and should be easy to move into PHP itself, to enable other SAPIs to use it.

In addition to sharing code, maintenance, performance optimization, etc., the existence of a common infrastructure would standardize the way worker scripts are created and provide a high-level PHP API for writing worker scripts that work with all SAPIs that rely on this new feature.

SAPIs will still have to handle fetching requests from the web server and pausing the worker to wait for new requests (in FrankenPHP, we use goroutines for this; in Rust or C, other primitives will have to be used), but almost everything else could be shared.

For reference, here's the FrankenPHP code I propose to integrate into libphp: https://github.com/dunglas/frankenphp/blob/main/frankenphp.c#L245

The public API is documented here: https://frankenphp.dev/docs/worker/#custom-apps

I'd like to hear what the community thinks about this. Would you be interested in this functionality in PHP? Should I work on an RFC?

If there's interest, I can work on a patch.

Cheers,
--
Kévin Dunglas
Re: [PHP-DEV] Set register_argc_argv to Off by default
This change seems reasonable to me: safer, with little chance of breaking things, and easy to reverse for the end user by changing a single parameter.
[PHP-DEV] Bad interactions between Fibers and GoRoutines (and/or cgo)
Hi there, We are experiencing strange problems with Fibers when running PHP with FrankenPHP. Fibers sometimes interact badly with the Go runtime on Linux x86 or amd64 (especially in Docker containers) and lead to crashes. We've tried many things: compiling with --disable-fiber-asm, compiling with -fsplit-stack, and increasing the system stack size limit, but crashes still occur. This looks related to how Go and Fibers manipulate the stack. Here is a detailed bug report: https://github.com/golang/go/issues/62130 And the reproducer: https://github.com/dunglas/frankenphp/pull/171 Does anyone have any idea what's going on? Best regards, -- Kévin Dunglas
[PHP-DEV] Proposal to incrementally improve timeout and signal handling
Hello Internals, PHP suffers from several issues related to timeout and signal handling, especially when built with ZTS enabled. 1. The current implementation of timeouts on UNIX builds seems "fundamentally incompatible with ZTS" ( https://bugs.php.net/bug.php?id=79464#1589205685) and, more anecdotally, conflicts with some Go features ( https://github.com/golang/go/issues/56260#issuecomment-1281040802) 2. "Zend Signals" causes segmentation faults and other problems in multi-threaded environments ( https://github.com/php/php-src/issues/9649#issuecomment-1264330874, https://github.com/php/php-src/pull/5591#issuecomment-650064098), and has seemed useless anyway since PHP 7.1 ( https://github.com/php/php-src/pull/5591#issuecomment-645428002) In 2020, Alex Dowad started a major refactoring to improve these parts ( https://github.com/php/php-src/pull/5570, https://github.com/php/php-src/pull/5591, https://github.com/php/php-src/pull/5710), but he stopped working on it. Instead of doing a major one-time refactoring like that, I propose moving forward little by little to limit the risks and the potential backward compatibility breaks. Here is the plan: 1. Switch to timer_create() for timeouts, but only on Linux (it's not supported on other platforms yet), when ZTS is enabled and Zend Signals is disabled. As the feature is currently entirely broken when ZTS is on, this can be considered a bug fix and cannot be worse than the current situation. 1bis. (Can be done independently of 1., even in parallel, and is optional as long as we keep the --disable-zend-signals flag.) Remove Zend Signals entirely, because even if it can be partially fixed (I proposed a patch fixing segfaults: https://github.com/php/php-src/pull/9766), it now seems useless and causes unfixable issues with some signals such as SIGINT ( https://github.com/php/php-src/issues/9649#issuecomment-1265811930) 2. Switch to Grand Central Dispatch on macOS and FreeBSD when ZTS is enabled and Zend Signals is disabled (if not removed at this point); GCD provides a feature similar to timer_create() on these platforms. 3. Probably in a future major version, and optional: switch to timer_create()/GCD even for non-ZTS builds to unify and simplify the code. What do you think about this plan? Apart from the technical aspects, what's the best way forward? Submit patches? Propose an RFC? Do both? (Pardon my ignorance of internals processes.) Thank you, -- Kévin Dunglas
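To make step 1 of the plan more concrete, here is a minimal sketch of a timeout built on timer_create(). It is not the patch discussed in this thread: the names arm_timeout, on_timeout and timed_out, as well as the choice of SIGRTMIN and CLOCK_MONOTONIC, are assumptions made for the example. It also uses process-directed delivery (SIGEV_SIGNAL), whereas a ZTS build would need each timer to signal the thread that armed it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <time.h>

/* Set by the handler when the timer expires (hypothetical flag for this sketch). */
static volatile sig_atomic_t timed_out = 0;

static void on_timeout(int signo)
{
    (void)signo;
    timed_out = 1; /* only async-signal-safe work belongs in a handler */
}

/*
 * Arms a one-shot timer that raises SIGRTMIN after `milliseconds`.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int arm_timeout(timer_t *timer, long milliseconds)
{
    struct sigaction sa;
    struct sigevent sev;
    struct itimerspec its;

    assert(timer != NULL && milliseconds > 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_timeout;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGRTMIN, &sa, NULL) != 0) {
        return -1;
    }

    /* Assumption: process-directed signal; a threaded build would target one thread. */
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGRTMIN;
    if (timer_create(CLOCK_MONOTONIC, &sev, timer) != 0) {
        return -1;
    }

    /* it_interval stays zero, so the timer fires once. */
    memset(&its, 0, sizeof its);
    its.it_value.tv_sec = milliseconds / 1000;
    its.it_value.tv_nsec = (milliseconds % 1000) * 1000000L;
    return timer_settime(*timer, 0, &its, NULL);
}
```

A caller would check timed_out at safe points and release the timer with timer_delete() once the guarded work is finished. On older glibc versions the program may need to be linked with -lrt.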
[PHP-DEV] Set SA_ONSTACK in zend_sigaction
Hi, internals! It's been a while. I'm currently working on a new SAPI for web servers written in Go. Many language runtimes, including Go's ( https://pkg.go.dev/os/signal#hdr-Go_programs_that_use_cgo_or_SWIG), depend on signal handlers being installed with SA_ONSTACK ( https://man7.org/linux/man-pages/man2/sigaltstack.2.html). This flag makes a handler run on the alternate signal stack that the thread registered with sigaltstack(), when there is one. Many argue that SA_ONSTACK should be a default, but it's not the case (yet). Python merged a patch setting SA_ONSTACK in 2021 (Python 3.10+) for the same reasons (https://bugs.python.org/issue43390 / https://github.com/python/cpython/commit/02ac6f41e5569ec28d625bb005155903f64cc9ee), with no issues. I opened a Pull Request to set this flag by default and tested it successfully with my Go SAPI: https://github.com/php/php-src/pull/9597 As this is technically at the limit between a new feature and a bug fix (having the ability to call Go/C++ VM code from PHP and to embed PHP in such programs), should I open an RFC? Also, if merging my patch is considered, which branch should I target? Cheers, -- Kévin Dunglas
[PHP-DEV] Add support for ::class to constant()
Hi folks, Currently, it's not possible to use the ::class special constant with the constant() function. This doesn't work: var_dump( constant('\DateTime::class') ); For instance, Twig's constant() helper uses this PHP function internally; consequently, the following Twig template doesn't work either (`myObject` contains an arbitrary object whose class we want to retrieve): {{ constant('class', myObject) }} I wrote a patch adding support for ::class: https://github.com/php/php-src/pull/6763 As this probably qualifies as a new feature, should I write an RFC too? Cheers,
[PHP-DEV] Re: hash_equals: leak less information about length
Hello internals, I submitted this PR a long time ago: https://github.com/php/php-src/pull/792 I still think it's a good idea to mitigate the length leak (rather than returning immediately when the strings are not of the same length) while stating in the docs that the length will leak in any case. The php.net doc has been fixed, but this is not the case, for instance, for the Symfony doc: http://symfony.com/doc/current/components/security/secure_tools.html (this method uses hash_equals internally; I've just submitted a PR to fix this doc, but I'm sure there are many other misuses in the wild). To summarize: a theoretical (especially for web apps, more annoying for CLI apps) and advertised leak is better than a big undocumented leak. Can you merge this PR? 2014-08-31 12:59 GMT+02:00 Kévin Dunglas dung...@gmail.com: Hi, I've submitted a PR to make the hash_equals function leak less information about the lengths of the compared strings (benchmark and use cases available in the comments): https://github.com/php/php-src/pull/792 Trying to hide the length is needed to replace the Symfony and Joomla PHP implementations with hash_equals (when available). The idea: - clearly state in the documentation that this function can potentially leak lengths - try to make it harder for an attacker by using a more robust implementation. If there is agreement to use this kind of implementation, I'll rework the PR to use some tricks from the CPython one ( https://github.com/python/cpython/blob/c7688b44387d116522ff53c0927169db45969f0e/Modules/_operator.c#L175 - use of volatile and no modulo). Best regards, -- Kévin Dunglas http://dunglas.fr -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
[PHP-DEV] Fixed Bug #65576 (Constructor from trait conflicts with inherited constructor)
Hi, I've published a patch for bug #65576: https://github.com/php/php-src/pull/946 Can you review and merge it, please? Best regards. -- Kévin Dunglas http://dunglas.fr http://les-tilleuls.coop
Re: [PHP-DEV] Fixed Bug #65576 (Constructor from trait conflicts with inherited constructor)
I've just implemented what is described in the linked bug (it's not my report, but my team has the same issue): https://bugs.php.net/bug.php?id=65576 The rationale is: it works the same way for all magic methods except the constructor, and it seems to be a regression introduced by the fix for another bug (see the comments in the bug tracker). 2014-12-08 16:17 GMT+01:00 Levi Morrison le...@php.net: I've published a patch for bug #65576: https://github.com/php/php-src/pull/946 Can you review and merge it, please? Are we sure that's the correct behavior? Can you provide some rationale for why it should happen this way? -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL
I've just pushed some changes to the PR. FILTER_VALIDATE_DOMAIN now checks character validity only if FILTER_FLAG_HOSTNAME is set. I've also rebased and fixed some issues detailed on GitHub. Yasuo, it's not trivial to use this new validator in FILTER_VALIDATE_EMAIL. Its current implementation uses a big regex that doesn't extract the domain part. Anyway, a good RFC-compliant email validator cannot be written as a regex. See https://github.com/egulias/EmailValidator for instance. I think it's work for another PR. I'll keep the email validator in its current state for now. Are you all OK with getting the current PR merged? 2014-11-12 19:10 GMT+01:00 Kévin Dunglas dung...@gmail.com: Hi Yasuo, I haven't changed (or even read) the email validator. I'll take a look at it. 2014-11-12 10:41 GMT+01:00 Yasuo Ohgaki yohg...@ohgaki.net: Hi Kevin, On Wed, Nov 12, 2014 at 4:09 PM, Kévin Dunglas dung...@gmail.com wrote: I'll change my PR according to the RFC I've quoted earlier: - check for valid characters (excluding underscore) only when FILTER_FLAG_HOSTNAME is set - allow any character but check lengths by default - use FILTER_FLAG_HOSTNAME to validate URLs What do you think about that? I haven't read the diff closely, but it seems OK to me. How is the email domain checked? I cannot see changes for it in the diff. Validating a host correctly is difficult, I would like to have your PR. Regards, -- Yasuo Ohgaki yohg...@ohgaki.net -- Kévin Dunglas http://dunglas.fr -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
[PHP-DEV] Re: IDN support in streams
Hi, Can a wiki admin give me RFC creation rights? My wiki username is: dunglas I'll submit an RFC for IDN support. It will require adding ICU as a core dependency (previous discussion here http://marc.info/?l=php-internals&m=141107203812897 and on GitHub). Thanks! 2014-11-05 8:34 GMT+01:00 Kévin Dunglas dung...@gmail.com: Hello, I've submitted a PR to add IDN support in PHP streams. The way it's done will allow easy IDN domain validation in ext/filter too. Can you review this PR please? https://github.com/php/php-src/pull/890 -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL
Hi Yasuo, I haven't changed (or even read) the email validator. I'll take a look at it. 2014-11-12 10:41 GMT+01:00 Yasuo Ohgaki yohg...@ohgaki.net: Hi Kevin, On Wed, Nov 12, 2014 at 4:09 PM, Kévin Dunglas dung...@gmail.com wrote: I'll change my PR according to the RFC I've quoted earlier: - check for valid characters (excluding underscore) only when FILTER_FLAG_HOSTNAME is set - allow any character but check lengths by default - use FILTER_FLAG_HOSTNAME to validate URLs What do you think about that? I haven't read the diff closely, but it seems OK to me. How is the email domain checked? I cannot see changes for it in the diff. Validating a host correctly is difficult, I would like to have your PR. Regards, -- Yasuo Ohgaki yohg...@ohgaki.net -- Kévin Dunglas http://dunglas.fr
Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL
Hi, I'll change my PR according to the RFC I've quoted earlier: - check for valid characters (excluding underscore) only when FILTER_FLAG_HOSTNAME is set - allow any character but check lengths by default - use FILTER_FLAG_HOSTNAME to validate URLs What do you think about that? Best regards, 2014-11-12 7:38 GMT+01:00 Yasuo Ohgaki yohg...@ohgaki.net: Hi all, On Fri, Nov 7, 2014 at 6:48 AM, Sanford Whiteman figureone...@gmail.com wrote: FWIW, there *is* a practical in-use (de facto if nothing else) convention of using _ in hosts for DKIM: _domainkey is actually in all the DKIM RFCs and in the formal STD 76, see § 3.6.2.1. Namespace, so it's more than a convention! _ is used for service name. Active Directory uses _ a lot, for example. e.g. _tcp, _sites, _ldap, etc. https://tools.ietf.org/html/rfc2782 Regards, -- Yasuo Ohgaki yohg...@ohgaki.net -- Kévin Dunglas http://dunglas.fr
Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL
FILTER_VALIDATE_DOMAIN checks conformance with the DNS RFCs: total length, label length and allowed characters (underscores are allowed in domain names, but many other characters, such as ~/+..., are forbidden). I'll add IDN support too once IDN support for streams is merged. FILTER_VALIDATE_URL checks conformance with the URL RFCs (and not URI, as discussed on GitHub). RFC conformance of a URL's host part implies conformance with the DNS, IPv4 and IPv6 RFCs, plus some additional checks (no underscore allowed in hostnames, and IPv6 addresses enclosed in brackets, for instance). That's why I've added the convenience flag FILTER_FLAG_HOSTNAME. Btw, there are many use cases for validating that a string is a valid domain (or a valid hostname): hosting and registrar apps, mail server management apps and anything else DNS-related. Maybe we can find a better name for FILTER_VALIDATE_DOMAIN, such as FILTER_VALIDATE_DOMAIN_NAME or FILTER_VALIDATE_DNS_DOMAIN (a bit redundant, DNS = Domain Name System), but please not something related to "DNS record", because a valid DNS record can have the following format: les-tilleuls.coop. 3600 IN SOA monsite.nnx.com .root.monsite.nnx.com. ( 2014092300 ; serial 21600 ; refresh (6 hours) 3600 ; retry (1 hour) 604800 ; expire (1 week) 86400 ; minimum (1 day) ) 2014-11-06 13:55 GMT+01:00 Andrey Andreev n...@devilix.net: Hi, On Thu, Nov 6, 2014 at 8:19 AM, Kévin Dunglas dung...@gmail.com wrote: Hi Andrey, Sorry but I think you're wrong. Domain != hostname. Underscores are allowed in domains (RFC 2181) but not in hostnames (RFC 1123 and later). To quote Wikipedia: While a hostname may not contain other characters, such as the underscore character (_), other DNS names may contain the underscore. Systems such as DomainKeys and service records use the underscore as a means to assure that their special character is not confused with hostnames. For example, _http._sctp.www.example.com specifies a service pointer for an SCTP capable webserver host (www) in the domain example.com. http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names You can also see this StackOverflow answer http://stackoverflow.com/a/2183140/1352334 I agree to an extent, but that is highly contextual. Who said that 'domain' === 'DNS record' (which is a very broad term anyway)? And IF we assume this, why do you need FILTER_VALIDATE_DOMAIN for it if it's only going to check length? Cheers, Andrey. -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
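The distinction between a domain name and a hostname discussed above can be sketched in a few lines of C. This is an illustration only, not the code of the pull request: the function name is_valid_domain and its hostname_only parameter are hypothetical, and the limits used (253 characters for the whole name, 63 per label, letters, digits and hyphens for hostnames) are my reading of the DNS RFCs cited in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME_LEN  253 /* assumed limit for the textual form of a name */
#define MAX_LABEL_LEN 63  /* assumed limit for one label */

/* Letters, digits and hyphen: the characters allowed in a hostname label. */
static int is_ldh(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '-';
}

/*
 * Returns 1 if `name` looks like a valid domain name, 0 otherwise.
 * By default only lengths are checked; with hostname_only != 0 each label
 * must also consist of letters, digits and hyphens, and must not start or
 * end with a hyphen (so underscores are rejected).
 */
int is_valid_domain(const char *name, int hostname_only)
{
    size_t len, i, label_len = 0;

    assert(name != NULL);

    len = strlen(name);
    if (len > 0 && name[len - 1] == '.') {
        len--; /* ignore the optional trailing root dot */
    }
    if (len == 0 || len > MAX_NAME_LEN) {
        return 0;
    }

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];

        if (c == '.') {
            if (label_len == 0) {
                return 0; /* empty label, e.g. "example..com" */
            }
            if (hostname_only && name[i - 1] == '-') {
                return 0; /* label ends with a hyphen */
            }
            label_len = 0;
            continue;
        }
        if (++label_len > MAX_LABEL_LEN) {
            return 0;
        }
        if (hostname_only && (!is_ldh(c) || (c == '-' && label_len == 1))) {
            return 0; /* forbidden character, or label starts with a hyphen */
        }
    }
    if (label_len == 0 || (hostname_only && name[len - 1] == '-')) {
        return 0;
    }
    return 1;
}
```

With this sketch, the service name quoted from Wikipedia, _http._sctp.www.example.com, is accepted as a domain name but rejected when hostname_only is set, which is the behavior the two filters are meant to separate.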
Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL
Hi, Following the discussion on GitHub, I've made some changes to this PR: - Added a new FILTER_VALIDATE_DOMAIN filter validating domain names - Added a FILTER_FLAG_HOSTNAME flag to allow checking hostnames (underscores are forbidden in hostnames but not in domains) - Changed FILTER_VALIDATE_URL to use this new validator Once https://github.com/php/php-src/pull/890 is merged, it will be easy to add IDN support to this new domain validator. 2014-10-14 13:48 GMT+02:00 Daniel Ribeiro drgom...@gmail.com: Nice work man, it looks really good. Daniel Ribeiro http://danielribeiro.org On Tue, Oct 14, 2014 at 3:41 PM, Kévin Dunglas dung...@gmail.com wrote: Hi, I opened a PR making FILTER_VALIDATE_URL more strict and more compliant with standards: https://github.com/php/php-src/pull/826 Can anyone review (and merge) this patch? Thanks! -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
Re: [PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL
Hi Andrey, Sorry but I think you're wrong. Domain != hostname. Underscores are allowed in domains (RFC 2181) but not in hostnames (RFC 1123 and later). To quote Wikipedia: While a hostname may not contain other characters, such as the underscore character (_), other DNS names may contain the underscore. Systems such as DomainKeys and service records use the underscore as a means to assure that their special character is not confused with hostnames. For example, _http._sctp.www.example.com specifies a service pointer for an SCTP capable webserver host (www) in the domain example.com. http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names You can also see this StackOverflow answer http://stackoverflow.com/a/2183140/1352334 2014-11-06 0:32 GMT+01:00 Andrey Andreev n...@devilix.net: Hi, On Wed, Nov 5, 2014 at 11:57 PM, Kévin Dunglas dung...@gmail.com wrote: - Added a new FILTER_VALIDATE_DOMAIN filter validating domain names - Added a FILTER_FLAG_HOSTNAME flag to allow checking hostnames (_ are forbidden in hostname but not in domains) This doesn't make any sense. A domain *is* a hostname and underscores are forbidden. Cheers, Andrey. -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
[PHP-DEV] IDN support in streams
Hello, I've submitted a PR to add IDN support in PHP streams. The way it's done will allow easy IDN domain validation in ext/filter too. Can you review this PR please? https://github.com/php/php-src/pull/890 -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
Re: [PHP-DEV] Internationalized Domain Name support in FILTER_VALIDATE_URL
Hi Chris, I've just blogged about IDN support in PHP. The post includes a (tiny) userland implementation for streams: http://dunglas.fr/2014/10/internationalized-domain-name-idn-and-php/ What do you think about the following to add native support: 1. As already stated, make ICU a dependency of core 2. Convert the host returned by php_parse_url here https://github.com/php/php-src/blob/master/ext/standard/http_fopen_wrapper.c#L154 to Punycode with http://icu-project.org/apiref/icu4c432/uidna_8h.html#a711fa1d2e6dd25d7368f5b3ea2aaedc6 It doesn't look too intrusive and seems relatively easy to implement. According to the RFC I quote in the blog post, it should work with SSL too. I can make a PR (or an RFC if needed) with this method if it seems applicable. Best regards, 2014-09-24 8:33 GMT+02:00 Pierre Joye pierre@gmail.com: On Wed, Sep 24, 2014 at 2:48 AM, Stas Malyshev smalys...@sugarcrm.com wrote: Hi! I'll implement optional (and not default) support of IDN in filter_var(). Does anyone know whether it's better to use libIDN (LGPL) or ICU (custom license derived from the X license) from a license point of view? ICU is definitely better since we already have a lot of code using ICU and AFAIK our current IDN functions (idn_to_*) use ICU. Which means it would be advantageous to keep it in the single library - whatever bugs there may be, at least the user will be dealing with one set of bugs instead of two :) Indeed :) However I am not sure yet we should do it, or at least not by default. It may introduce side effects or BC issues. While IDN is bi-directional or could be called many times and returning the same result, we have to be careful not to break things out there, for example someone relying on it to process URI/URL. Cheers, -- Pierre @pierrejoye | http://www.libgd.org -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
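In the proposal above, the conversion in point 2 would be delegated to ICU. To show what the Punycode part of that conversion amounts to, here is a sketch of the RFC 3492 encoder for a single label, given as an array of Unicode code points. It is not a substitute for an IDN library: it deliberately skips the mapping and normalization step (Nameprep) that such a library performs before encoding, and the names punycode_encode_label and label_to_ace are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bootstring parameters for Punycode (RFC 3492, section 5). */
enum { BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700,
       INITIAL_BIAS = 72, INITIAL_N = 0x80 };

/* Maps a digit value 0..35 to 'a'..'z' then '0'..'9'. */
static char encode_digit(uint32_t d)
{
    return (char)(d < 26 ? 'a' + d : '0' + (d - 26));
}

/* Bias adaptation (RFC 3492, section 6.1). */
static uint32_t adapt(uint32_t delta, uint32_t numpoints, int first_time)
{
    uint32_t k = 0;

    delta = first_time ? delta / DAMP : delta / 2;
    delta += delta / numpoints;
    while (delta > ((BASE - TMIN) * TMAX) / 2) {
        delta /= BASE - TMIN;
        k += BASE;
    }
    return k + (BASE - TMIN + 1) * delta / (delta + SKEW);
}

/*
 * Encodes one label into `out` (NUL-terminated).
 * Returns the encoded length, or -1 if `out` is too small or on overflow.
 */
int punycode_encode_label(const uint32_t *in, size_t in_len, char *out, size_t out_cap)
{
    uint32_t n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
    size_t handled, basic, o = 0, j;

    assert(in != NULL && out != NULL);
    if (out_cap == 0) return -1;

    /* Copy the basic (ASCII) code points first, in order. */
    for (j = 0; j < in_len; j++) {
        if (in[j] < 0x80) {
            if (o + 1 >= out_cap) return -1;
            out[o++] = (char)in[j];
        }
    }
    handled = basic = o;
    if (basic > 0) {
        if (o + 1 >= out_cap) return -1;
        out[o++] = '-'; /* delimiter between basic and encoded parts */
    }

    while (handled < in_len) {
        uint32_t m = UINT32_MAX;

        /* Find the smallest code point not handled yet. */
        for (j = 0; j < in_len; j++)
            if (in[j] >= n && in[j] < m) m = in[j];
        if (m - n > (UINT32_MAX - delta) / (handled + 1)) return -1;
        delta += (m - n) * (uint32_t)(handled + 1);
        n = m;

        for (j = 0; j < in_len; j++) {
            if (in[j] < n && ++delta == 0) return -1;
            if (in[j] == n) {
                /* Emit delta as a generalized variable-length integer. */
                uint32_t q = delta, k;
                for (k = BASE; ; k += BASE) {
                    uint32_t t = k <= bias ? TMIN : k >= bias + TMAX ? TMAX : k - bias;
                    if (q < t) break;
                    if (o + 1 >= out_cap) return -1;
                    out[o++] = encode_digit(t + (q - t) % (BASE - t));
                    q = (q - t) / (BASE - t);
                }
                if (o + 1 >= out_cap) return -1;
                out[o++] = encode_digit(q);
                bias = adapt(delta, (uint32_t)(handled + 1), handled == basic);
                delta = 0;
                handled++;
            }
        }
        delta++;
        n++;
    }
    out[o] = '\0';
    return (int)o;
}

/* Writes "xn--" followed by the Punycode form; meant for labels with non-ASCII code points. */
int label_to_ace(const uint32_t *in, size_t in_len, char *out, size_t out_cap)
{
    int n;

    if (out_cap < 5) return -1;
    memcpy(out, "xn--", 4);
    n = punycode_encode_label(in, in_len, out + 4, out_cap - 4);
    return n < 0 ? -1 : n + 4;
}
```

Applied to the code points of the label académie-française, label_to_ace produces xn--acadmie-franaise-npb1a, the ASCII form used for that domain elsewhere in these threads. A real implementation would process each dot-separated label of the host in turn and leave pure ASCII labels untouched.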
[PHP-DEV] Better RFC conformance for FILTER_VALIDATE_URL
Hi, I opened a PR making FILTER_VALIDATE_URL more strict and more compliant with standards: https://github.com/php/php-src/pull/826 Can anyone review (and merge) this patch? Thanks! -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
Re: [PHP-DEV] #68049 filter_var echo wrong result for a url
Some browsers do. Some versions of IE are buggy when the URL includes underscores: http://stackoverflow.com/questions/794243/internet-explorer-ignores-cookies-on-some-domains-cannot-read-or-set-cookies I think that filter_var must follow the RFC by default. Maybe we can add a flag to allow malformed URLs that are in use in the wild? 2014-09-21 10:42 GMT+02:00 Florian Margaine flor...@margaine.com: Hi, According to https://bugs.php.net/bug.php?id=51192 , valid URLs cannot contain underscores. The following bug was reported a couple days ago: https://bugs.php.net/bug.php?id=68049 The thing is, browsers *do* accept the underscore in URLs. Should the rfc3986 http://tools.ietf.org/html/rfc3986#section-3.2.2 be respected, or should PHP be lenient like browsers and accept more? Regards, *Florian Margaine* -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
Re: [PHP-DEV] Internationalized Domain Name support in FILTER_VALIDATE_URL
I'll implement optional (and not default) support of IDN in filter_var(). Does anyone know whether it's better to use libIDN (LGPL) or ICU (custom license derived from the X license) from a license point of view? 2014-09-19 16:18 GMT+02:00 Chris Wright c...@daverandom.com: On 19 September 2014 14:48, Kévin Dunglas dung...@gmail.com wrote: Support of IDN in streams is a must-have. But there are a lot of other use cases for URL validation with IDN. The most common is probably form validation (testing whether a user-submitted URL has a valid format and can be used to create an HTML link...). I'm OK with making IDN validation optional and not used by default until PHP natively supports IDN in other features such as streams. But IDNs are used more and more in the wild, and from a user's point of view it is disappointing that a valid URL, working in browsers and even displayed by Google Search, is not considered valid by a PHP-based website using filter_var() without a specific flag. Even some TLDs use non-ASCII characters, for example: http://旅游气象.中国 http://xn--zfv73l7xbp87c.xn--fiqs8s (popular Chinese weather site). About the library, I have no preference between libidn and ICU. If libidn's license fits better with the PHP one, libidn is probably the better choice. Having a PHP-specific implementation of Stringprep and Punycode doesn't sound like a good idea (reinventing the wheel, more code to maintain). Chris, is there a chance to have your work on streams merged in PHP 7? It's very hacky and PoC at the moment. I've got a bunch of time-consuming personal things going on right now, but within the next couple of weeks I will try and polish it up into something serviceable, maintainable and tested/less likely to explode with edge-cases and then I'll put it up for discussion. I'm also fine if someone else wants to have a crack in the meantime, I can push my work so far to github early next week when I get access to the machine.
I'd certainly like the functionality to be in 7 if it's viable from a licensing and dependency PoV - I had been holding off bringing it up to see what happened with the more general unicode support discussion (which I somewhat lost track of and seems to have died out) as there was talk of introducing a hard dependency on ICU-or-similar at one point, which would have made this a no-brainer. What do you think about the following plan: - 5.7 (if it exists): add IDN support in filter, disabled by default. Use libidn if it is also selected for streams. - 7 (if IDN support for streams is completed): validate IDN by default (what the user expects), add a flag to disable IDN validation. Of course we'll update the doc to explain the new behavior. 2014-09-19 12:28 GMT+02:00 Chris Wright c...@daverandom.com: On 19 September 2014 10:58, Pierre Joye pierre@gmail.com wrote: Hi, On Sep 19, 2014 4:03 PM, Chris Wright c...@daverandom.com wrote: Kévin On 18 September 2014 21:26, Kévin Dunglas dung...@gmail.com wrote: Hello, I'm working on enhancing the FILTER_VALIDATE_URL filter ( https://github.com/php/php-src/pull/826). The current implementation does not support validation of internationalized domain names (e.g. http://www.académie-française.fr/ http://www.xn--acadmie-franaise-npb1a.fr/). Support of IDN validation can be easily added using ICU's uidna_toASCII() function. Is it acceptable to add a dependency to ICU for ext/filter? Another option is to add a HAVE_ICU constant in main/php_config.h and to validate IDN only if ICU is present. What strategy is preferred? I've done some work around this area previously, and all I will say is: be careful with what you do with this from a userland PoV. PHP does not natively support IDN in stream open routines or SSL verification routines.
It will never support these things without at least one of: - a core dependency on ICU, libidn or similar - moving streams into an extension so a dependency can be introduced there (probably not sanely possible) - an in-house NAMEPREP implementation (this is the hard part of IDN, punycode itself is pretty trivial to implement once you have a canonical set of codepoints) These things can be implemented with *a lot* of boilerplate in userland when you have ext/intl, but it's not pretty. libcurl *can* support IDN if it was built against libidn, I'm not sure if this is currently the case in common distributions or not. Since one almost never just validates a URL string, it's usually a precursor to attempting to open it, this could lead to some pretty hefty wtfs. In short, while I'm generally for ext/filter being able to handle IDN, I *do not* believe it should do it implicitly, it should require an explicit flag, because
Re: [PHP-DEV] #68049 filter_var echo wrong result for a url
I've recently proposed a refactoring of FILTER_VALIDATE_URL: https://github.com/php/php-src/pull/826 I can easily add support for this new flag if everyone agrees. 2014-09-22 9:09 GMT+02:00 Florian Margaine flor...@margaine.com: Oh, IE. *sigh* Adding a new flag sounds like a good idea indeed, `FILTER_VALIDATE_UNCOMPLIANT_URL` sounds good enough? I guess it should accept underscores and domain names starting with numbers too. Regards, *Florian Margaine* P.S: sorry Kevin for the double mail. On 22 Sept. 2014 at 09:03, Kévin Dunglas dung...@gmail.com wrote: Some browsers do. Some versions of IE are buggy when the URL includes underscores: http://stackoverflow.com/questions/794243/internet-explorer-ignores-cookies-on-some-domains-cannot-read-or-set-cookies I think that filter_var must follow the RFC by default. Maybe we can add a flag to allow malformed URLs that are in use in the wild? 2014-09-21 10:42 GMT+02:00 Florian Margaine flor...@margaine.com: Hi, According to https://bugs.php.net/bug.php?id=51192 , valid URLs cannot contain underscores. The following bug was reported a couple days ago: https://bugs.php.net/bug.php?id=68049 The thing is, browsers *do* accept the underscore in URLs. Should the rfc3986 http://tools.ietf.org/html/rfc3986#section-3.2.2 be respected, or should PHP be lenient like browsers and accept more? Regards, *Florian Margaine* -- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
Re: [PHP-DEV] Internationalized Domain Name support in FILTER_VALIDATE_URL
Support of IDN in streams is a must-have. But there are a lot of other use cases for URL validation with IDN. The most common is probably form validation (testing whether a user-submitted URL has a valid format and can be used to create an HTML link...). I'm OK with making IDN validation optional and not used by default until PHP natively supports IDN in other features such as streams. But IDNs are used more and more in the wild, and from a user's point of view it is disappointing that a valid URL, working in browsers and even displayed by Google Search, is not considered valid by a PHP-based website using filter_var() without a specific flag. Even some TLDs use non-ASCII characters, for example: http://旅游气象.中国 http://xn--zfv73l7xbp87c.xn--fiqs8s (popular Chinese weather site). About the library, I have no preference between libidn and ICU. If libidn's license fits better with the PHP one, libidn is probably the better choice. Having a PHP-specific implementation of Stringprep and Punycode doesn't sound like a good idea (reinventing the wheel, more code to maintain). Chris, is there a chance to have your work on streams merged in PHP 7? What do you think about the following plan: - 5.7 (if it exists): add IDN support in filter, disabled by default. Use libidn if it is also selected for streams. - 7 (if IDN support for streams is completed): validate IDN by default (what the user expects), add a flag to disable IDN validation. Of course we'll update the doc to explain the new behavior. 2014-09-19 12:28 GMT+02:00 Chris Wright c...@daverandom.com: On 19 September 2014 10:58, Pierre Joye pierre@gmail.com wrote: Hi, On Sep 19, 2014 4:03 PM, Chris Wright c...@daverandom.com wrote: Kévin On 18 September 2014 21:26, Kévin Dunglas dung...@gmail.com wrote: Hello, I'm working on enhancing the FILTER_VALIDATE_URL filter ( https://github.com/php/php-src/pull/826).
The current implementation does not support validation of internationalized domain names (i.e: http://www.académie-française.fr/ http://www.xn--acadmie-franaise-npb1a.fr/ http://www.xn--acadmie-franaise-npb1a.fr/). Support of IDN validation can be easily added using ICU's uidna_toASCII() function. Is it acceptable to add a dependency to ICU for ext/filter? Another option is to add a HAVE_ICU constant in main/php_config.h and to validate IDN only if ICU is present. What strategy is preferred? I've done some work around this area previously, and all I will say is: be careful with what you do with this from a userland PoV. PHP does not natively support IDN in stream open routines or SSL verification routines. It will never support these things without at least one of: - a core dependency on ICU, libidn or similar - moving streams into an extension so a dependency can be introduced there (probably not sanely possible) - an in-house NAMEPREP implementation (this is the hard part of IDN, punycode itself is pretty trivial to implement once you have a canonical set of codepoints) These things can be implemented with *a lot* of boilerplate in userland when you have ext/intl, but it's not pretty. libcurl *can* support IDN if it was built against libidn, I'm not sure if this is currently the case in common distributions or not. Since one almost never just validates a URL string, it's usually a precursor to attempting to open it, this could lead to some pretty hefty wtfs. In short, while I'm generally for ext/filter being able to handle IDN, I *do not* believe it should do it implicitly, it should require an explicit flag, because it will break *a lot* of code if IDN is suddenly treated as valid where it previously wasn't. I am really not sure about that especially the enabling by default part. The doc is pretty clear about what this filter supports and allowing idn may break a lot of codes out there. From an implementation point of view we may not need ICU to support IDN. 
Windows does not use it and there are license friendly decoder implementations too. If we can agree on adding a core dependency on some IDN support lib, I already have an experimental local branch that adds full IDN support to streams. It's based on libidn but it would be easy enough to swap it out for something else that provides the same functionality. In my (biased) opinion, streams are a far more important element of IDN support. Filter validation is just polish/a nicety on top. -- Kévin Dunglas http://dunglas.fr
[PHP-DEV] Internationalized Domain Name support in FILTER_VALIDATE_URL
Hello, I'm working on enhancing the FILTER_VALIDATE_URL filter ( https://github.com/php/php-src/pull/826). The current implementation does not support validation of internationalized domain names (e.g. http://www.académie-française.fr/ http://www.xn--acadmie-franaise-npb1a.fr/). Support for IDN validation can easily be added using ICU's uidna_toASCII() function. Is it acceptable to add a dependency on ICU for ext/filter? Another option is to add a HAVE_ICU constant in main/php_config.h and to validate IDN only if ICU is present. Which strategy is preferred? -- Kévin Dunglas http://dunglas.fr
[PHP-DEV] Re: Internationalized Domain Name support in FILTER_VALIDATE_URL
Hi,

The flag is a good idea to handle old systems, but the feature must be enabled by default (at least for PHP 7), with the flag available to disable it. The IDN RFCs are more than 10 years old. All major browsers and registrars support IDN.

On Friday, 19 September 2014, Tjerk Meesters tjerk.meest...@gmail.com wrote:

On 19 Sep 2014, at 06:52, Andrea Faulds a...@ajf.me wrote:

On 18 Sep 2014, at 21:26, Kévin Dunglas dung...@gmail.com wrote:

I'm working on enhancing the FILTER_VALIDATE_URL filter (https://github.com/php/php-src/pull/826). The current implementation does not support validation of internationalized domain names (e.g. http://www.académie-française.fr/, http://www.xn--acadmie-franaise-npb1a.fr/). Support for IDN validation can easily be added using ICU's uidna_toASCII() function. Is it acceptable to add a dependency on ICU for ext/filter? Another option is to add a HAVE_ICU constant in main/php_config.h and to validate IDN only if ICU is present. What strategy is preferred?

Perhaps add a new filter that covers normal URLs and IDN ones? I just imagine it might cause problems if suddenly IDNs are accepted, if there is a backend which can’t handle them.

We don’t need a new filter, you can simply add a filter flag for FILTER_VALIDATE_URL, e.g. FILTER_FLAG_ALLOW_IDN. Of course, the ICU dependency should be optional :)

-- Andrea Faulds http://ajf.me/

-- Kévin Dunglas Freelance consultant and developer http://dunglas.fr Tel.: 06 60 91 20 20
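The flag idea discussed in this message can be sketched as a small gate in front of the existing validation. This is a minimal illustration under stated assumptions: FILTER_FLAG_ALLOW_IDN is only a proposed name in the thread, so the constant SKETCH_FLAG_ALLOW_IDN, its value, and both function names below are hypothetical. The sketch shows the opt-in variant; the enabled-by-default variant argued for above would simply invert the test.

```c
#include <stddef.h>

/* Hypothetical flag bit; the thread only proposes the name FILTER_FLAG_ALLOW_IDN. */
#define SKETCH_FLAG_ALLOW_IDN (1 << 0)

/* Returns 1 if the host contains any byte outside 7-bit ASCII. */
static int sketch_host_has_non_ascii(const char *host)
{
    size_t i;

    for (i = 0; host[i] != '\0'; i++) {
        if ((unsigned char)host[i] > 0x7F) {
            return 1;
        }
    }
    return 0;
}

/*
 * Returns 1 if the host may proceed to further validation, 0 if it is
 * rejected because it is an IDN and the caller did not opt in.
 */
int sketch_host_allowed(const char *host, int flags)
{
    if (!sketch_host_has_non_ascii(host)) {
        /* Plain ASCII hosts (including xn-- forms) follow the existing path. */
        return 1;
    }
    return (flags & SKETCH_FLAG_ALLOW_IDN) != 0;
}
```

Hosts already in their ASCII-compatible form, such as www.xn--acadmie-franaise-npb1a.fr, pass the gate with or without the flag, so existing callers would see no change in behaviour.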
[PHP-DEV] hash_equals: leak less information about length
Hi, I've submitted a PR to make the hash_equals function leak less information about the lengths of the compared strings (benchmark and use cases available in the comments): https://github.com/php/php-src/pull/792

Trying to hide the length is needed in order to replace the Symfony and Joomla PHP implementations with hash_equals (when available). The idea:

- Clearly state in the documentation that this function can potentially leak lengths.
- Try to make it harder for an attacker by using a more robust implementation.

If there is agreement to use this kind of implementation, I'll rework the PR to use some tricks from the CPython one (https://github.com/python/cpython/blob/c7688b44387d116522ff53c0927169db45969f0e/Modules/_operator.c#L175 - use of volatile and no modulo).

Best regards, -- Kévin Dunglas http://dunglas.fr
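As an illustration of the kind of comparison being discussed, here is a minimal sketch in the spirit of the CPython approach mentioned above (volatile accesses, no modulo). It is not the code from the pull request nor the php-src implementation; the function name timing_safe_equals and its parameter order are assumptions. The loop always runs over the user-supplied length, so the running time does not vary with the length of the known string.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not the php-src implementation.
 * Returns 1 if both buffers have the same length and content, 0 otherwise.
 */
int timing_safe_equals(const unsigned char *known, size_t known_len,
                       const unsigned char *user, size_t user_len)
{
    /* volatile discourages the compiler from folding or short-circuiting. */
    volatile const unsigned char *left = user;  /* default: compare user with itself */
    volatile const unsigned char *right = user;
    volatile unsigned char result = 1;          /* default: lengths differ, so fail */
    size_t i;

    if (known_len == user_len) {
        /* Same length: compare against the real known string. */
        left = known;
        result = 0;
    }

    /* Always iterate user_len times, whatever the known length is. */
    for (i = 0; i < user_len; i++) {
        result |= left[i] ^ right[i];
    }

    return result == 0;
}
```

When the lengths differ, the function still performs a full pass over the user-supplied bytes before returning 0, which is what makes the length of the known string harder to infer from timing. It cannot hide it completely, hence the documentation note proposed in the message.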