[PHP-DEV] Re: DOMXPath / XSLTProcessor function callbacks

2023-10-16 Thread Niels Dossche
Hi Frederik

Sorry for the resend... I accidentally replied to you only without including 
the list the first time.

On 15/10/2023 21:37, Frederik Bosch wrote:
> Dear Niels,
> 
> First of all, thanks for all your hard work already on the DOM and SimpleXML 
> extensions. I have been following your work in PHP-SRC, great! I am the 
> author of this XSL 2.0 Transpiler in PHP package 
> (https://github.com/genkgo/xsl). It is indeed possible to use workarounds for 
> closures or object methods. I am relying on them in my package.
> 
> My suggestion for the future would be to add the following method.
> 
> public XSLTProcessor::registerFunctionsNS(string $namespace, 
> array|ArrayAccess $functions): void
> 
> Then a user can register functions like this.
> 
> $xsltProcessorOrDomXpath->registerFunctionsNS('urn:my.namespace', 
> array('upper-case', 'strtoupper', 'pow' => fn ($a, $b) => $a[0]->textContent 
> ** $b[0]->textContent, 'other' => [$obj, 'method']);
> 

Interesting suggestion. So you want to be able to use something like 
`my.namespace:function(...)` as I understand it.
I'm not sure that adding this is that much more beneficial though (complexity 
vs benefit trade-off).
I assume this is motivated by the fact that you can then use third party 
libraries while having to worry less about name clashes?
Let's say we add non-namespace `registerFunction(string $name, callable 
$callback): void`, you can then still use a convention of using a prefix, thus 
_kinda_ achieving the same.

In any case, this is going to be hard to support in combination with the 
underlying library (libxslt).
That's because the function namespace registration is process-wide, so this 
cannot be changed at runtime and certainly not for ZTS SAPIs.

> The registered functions should use the same methodology as php:function(). 
> Hence, string casting of arguments is something the library user should do. I 
> would leave registerPHPFunctions as is, and maybe discourage it in favor of 
> the method above. What if both are called? I think it would be most clear if 
> the registerFunctionsNS method throws InvalidArgumentException when 
> http://php.net/xsl or http://php.net/xpath is passed as namespace.
> 
> Cheers,
> Frederik

Cheers
Niels

> 
> 
> On 13-10-2023 00:39, Niels Dossche wrote:
>> I'm looking to extend the functionality of calling PHP functions from within 
>> the DOMXPath or XSLTProcessor classes.
>>
>> In case you're unfamiliar here's a quick rundown.
>> The DOMXPath class allows you to execute XPath queries on a DOM tree to 
>> lookup certain nodes satisfying a filter.
>> PHP allows the user to execute function callbacks within these. For example 
>> (from the manual):
>>    $xpath->query('//book[php:functionString("substr", title, 0, 3) = 
>> "PHP"]');
>> This will read the title element's text content, call substr on it, and then 
>> compare the output against "PHP".
>> You can not only call builtin functions, but also user functions.
>>
>> To be able to call PHP functions, you need to use 
>> DOMXPath::registerPhpFunctions() 
>> (https://www.php.net/manual/en/domxpath.registerphpfunctions.php).
>> You either pass in NULL to allow all functions, or pass in which function 
>> names are allowed to be called.
>>
>> Similarly, XSLTProcessor has the same registerPhpFunctions() method.
>> For XSLT it's mostly used for performing arbitrary manipulations on input 
>> data.
>> Normally the output of the function is put into the resulting document.
>>
>>
>> So what's the problem?
>> The current system doesn't allow you to call closures or object methods.
>> There are tricks you can do with global variables and global functions to 
>> try to work around this, but that's quite cumbersome.
>>
>> There are two feature requests for this on the old bugtracker:
>>    - https://bugs.php.net/bug.php?id=38595
>>    - https://bugs.php.net/bug.php?id=49567
>>
>> It's not hard to implement support for this, the question is just what API 
>> we should go with.
>> Based on what I've read, there are at least two obvious options:
>>
>>
>> OPTION 1) Extend registerPHPFunctions() such that you can pass in callables
>>
>> ```
>> // Adapted from https://bugs.php.net/bug.php?id=38595
>> $xslt->registerPHPFunctions(array(
>> 'functionblah', // Like we used to
>> 'func2' => fn ($x) => ...,
>> 'func3' => array($obj, 'method'), // etc
>> ));
>> ```
>>
>> Example: Using php:function("func3") inside XPath/XSLT in this case will 
>> result in calling method on $obj.
>> Similarly func2 will call the closure, and functionblah in the snippet just 
>> allowlists calling functionblah.
>>
>> It's a backwards compatible solution and a natural extension to the current 
>> method.
>> It may be hard to discover this feature compared to having a new API though.
>>
>> Furthermore, once you pass in function names to registerPHPFunctions(), 
>> you're restricting what can be called.
>> For example: imagine you want to call both ucfirst() and $obj->method(), so 

[PHP-DEV] Re: [RFC] [VOTE] DOM HTML5 parsing and serialization

2023-10-16 Thread Niels Dossche
On 02/10/2023 20:19, Niels Dossche wrote:
> Hi internals
> 
> I just opened the vote on my RFC "DOM HTML5 parsing and serialization".
> RFC link: https://wiki.php.net/rfc/domdocument_html5_parser
> Discussion (externals.io): https://externals.io/message/120972
> 
> Voting will run for two weeks until 2023-10-16 20:20 GMT+2
> There is one primary vote (2/3 majority) and one secondary vote (50% 
> majority).
> 
> Kind regards
> Niels

Hi internals

The RFC has been accepted with 16 yes votes, and 0 no votes.
The secondary vote, an API implementation detail: "DOM\HTMLDocument::fromFile 
should respect the resolver set by libxml_set_external_entity_loader" was 
declined.
Thanks for everyone who voted. And a special thanks to Tim and Arne for the 
many discussions about the API design.

Cheers
Niels

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] Previous discussions about generics syntax only?

2023-10-16 Thread Alex Wells
On Mon, Oct 16, 2023 at 5:08 PM Olle Härstedt 
wrote:

> Hello internals,
>
> Was there a previous discussion about the pros/cons of adding only the
> syntax needed for generics, but not the functionality? So static
> analyzers could use it, instead of docblocks. I looked at externals.io
> but couldn't find anything specific.
>

Hey. There was a somewhat recent discussion on GH, but without experimental
features system, it's unlikely it would pass:
https://github.com/PHPGenerics/php-generics-rfc/issues/49


Re: [PHP-DEV] Previous discussions about generics syntax only?

2023-10-16 Thread Benjamin Morel
Hi, yes there was, back in 2020: https://externals.io/message/111875

- Benjamin

On Mon, 16 Oct 2023 at 16:08, Olle Härstedt  wrote:

> Hello internals,
>
> Was there a previous discussion about the pros/cons of adding only the
> syntax needed for generics, but not the functionality? So static
> analyzers could use it, instead of docblocks. I looked at externals.io
> but couldn't find anything specific.
>
> Regards
> Olle
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: https://www.php.net/unsub.php
>
>


[PHP-DEV] Previous discussions about generics syntax only?

2023-10-16 Thread Olle Härstedt
Hello internals,

Was there a previous discussion about the pros/cons of adding only the
syntax needed for generics, but not the functionality? So static
analyzers could use it, instead of docblocks. I looked at externals.io
but couldn't find anything specific.

Regards
Olle

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] Better name for method Randomizer::nextFloat()

2023-10-16 Thread Tim Düsterhus

Hi

On 10/16/23 10:34, Christian Schneider wrote:

Am 15.10.2023 um 19:24 schrieb Tim Düsterhus :

Making getFloat(float $min = 0.0, float $max = 1.0, IntervalBoundary $boundary = 
IntervalBoundary::ClosedOpen) would seemingly make it legal to call 
->getFloat(0.5), which I consider to be worse than nextFloat().


While I understand that you find getFloat(0.5) undesirable I would not consider 
it illegal, e.g. getFloat(max:5).
Additionally I would consider getInt(max:100) to be a very valid usage too.


I don't, because I prefer explicit calls. If the upper bound is 
specified explicitly, the lower bound should be as well. In fact the 
`0,` is even shorter than spelling out `max:`.



Side note: The implementation of nextFloat() is much more efficient than that 
of getFloat(0.0, 1.0) and the output will differ even for the same seeded 
engine. The domain of legal return values is identical.


This sounds like an implementation detail as I'm pretty sure you could switch 
to that fast path easily enough if neither of the arguments were given.


Yes, that's why it's a side note.


Side-note: The case of nextInt() is a bit trickier and strikes me as a bit of a 
weird API and I'd consider dropping it too: Having the max value depend on the 
Randomizer engine makes the generated values unstable in regards to switch the 
Engine. And if I read the documentation correctly then it only generates 32 bit 
values (or should it be 31 bits as it returns positive values only?). I think 
it would be clearer to require the Engine to provide at least 4 bytes and then 
specify the default max value of getInt() to be 2^31, optionally overridable 
with something up to PHP_INT_MAX.


Yes, Randomizer::nextInt() is terrible. Let me provide the context why 
it exists like it exists:


1. The Randomizer was designed to be drop-in compatible with the 
existing randomness functionality (mt_rand, array_rand, shuffle, 
strshuffle) when combining it with the Mt19937 engine.


2. Thus mt_rand is equivalent to Randomizer::getInt().

3. It is legal to call mt_rand() without parameters.

4. This behavior was carried over into getInt().

5. During the final review, it was noticed that this is an overloaded 
function and a decision was made not to do this, because overloaded 
functions are terrible.


6. Thus the getInt()-without-parameters call was split into nextInt().

I think that the goal of making the new API a drop-in replacement has 
been a useful goal to make it easier to migrate and I assume that's why 
no one questioned whether retrieving "an arbitrary positive integer" is 
even useful in the first place. While writing the documentation I first 
realized how useless that is, because I was unable to write something 
useful for Randomizer::nextInt(), but at that point it was way too late. 
Hindsight is 20/20.


Given that knowledge, is naming `nextFloat()` like `nextInt()` a 
mistake? I'm not sure. It was the obvious naming choice to me back when 
I wrote the RFC a year ago and no one pointed out how the name was not 
ideal, not during the RFC discussion phase, nor during the vote or in 
the half year leading up to the feature freeze. Now 6 weeks before the 
gold release, with 4 release candidates released and several blog posts 
written, the method name suddenly is a bad choice?



So in summary I'd support both adding default values to getInt()/getFloat() as 
well as dropping nextFloat() in favor of getFloat() (and possible nextInt() 
with getInt()).



I'd be on board with deprecating nextInt() in PHP 8.4. I'd also be on 
board with creating a better-named alias for nextFloat(), but I believe 
that it should remaing a standalone method, because of its usefulness.


Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] Peculiar behaviour of ext/xml set_X_handler() functions

2023-10-16 Thread G. P. B.
On Tue, 10 Oct 2023 at 14:52, G. P. B.  wrote:

> Hello internals,
>
> While doing some refactoring of ext/xml to bring it up to modern engine
> standards, I discovered some peculiar behaviour of the functions which set
> callable handlers.
>
> The PR in question is: https://github.com/php/php-src/pull/12340
>
> The issue is that this function doesn't just accept the usual callabes,
> but also passing a method name that should be called on an object provided
> by xml_set_object().
>
> Moreover, contrary to the usual semantics of checking if the value is
> callable when passed to the function, it is checked when attempting to call
> it.
> The PR changes this to validate that the function or method name provided
> is actually callable.
> This requires a BC break in that xml_set_object() MUST be called prior to
> setting the method names, otherwise a ValueError is now thrown.
>
> The only project I could find that called xml_set_object() after already
> setting method names for handlers is semsol/arc2
>  to which I have sent a PR to update this.
>
> The existence of the already set methods is now also checked when calling
> xml_set_object() again with a different object to ensure that the handlers
> are always well-defined.
>
> The plan is to also deprecate this feature out right, but this can be done
> by adding it to the mass 8.4 deprecation RFC. [1]
>
> Point of this email is to know if anyone has any complaints about this BC
> break before merging the PR into master.
>
> Best regards,
>
> George P. Banyard
>
> [1] https://wiki.php.net/rfc/deprecations_php_8_4
>

If no one complains I will merge the PR at the end of the week.

Best regards,

George P. Banyard


Re: [PHP-DEV] Better name for method Randomizer::nextFloat()

2023-10-16 Thread G. P. B.
On Mon, 16 Oct 2023 at 09:34, Christian Schneider 
wrote:

> > Side note: The implementation of nextFloat() is much more efficient than
> that of getFloat(0.0, 1.0) and the output will differ even for the same
> seeded engine. The domain of legal return values is identical.
>
> This sounds like an implementation detail as I'm pretty sure you could
> switch to that fast path easily enough if neither of the arguments were
> given.
>

This is not how the engine works, it would require an if-clause on every
single call of getFloat() that would probably defeat the point of it.

Best regards,

George P. Banyard


Re: [PHP-DEV] Two new functions array_first() and array_last()

2023-10-16 Thread Pierre

Le 15/10/2023 à 18:09, Larry Garfield a écrit :

That has already been done:https://www.php.net/array_is_list

--Larry Garfield


Oh, I forgot it was accepted and merged, thanks for pointing at it.

Cheers,

Pierre



Re: [PHP-DEV] Better name for method Randomizer::nextFloat()

2023-10-16 Thread Christian Schneider
Am 15.10.2023 um 19:24 schrieb Tim Düsterhus :
> Making getFloat(float $min = 0.0, float $max = 1.0, IntervalBoundary 
> $boundary = IntervalBoundary::ClosedOpen) would seemingly make it legal to 
> call ->getFloat(0.5), which I consider to be worse than nextFloat().

While I understand that you find getFloat(0.5) undesirable I would not consider 
it illegal, e.g. getFloat(max:5).
Additionally I would consider getInt(max:100) to be a very valid usage too.

> Side note: The implementation of nextFloat() is much more efficient than that 
> of getFloat(0.0, 1.0) and the output will differ even for the same seeded 
> engine. The domain of legal return values is identical.

This sounds like an implementation detail as I'm pretty sure you could switch 
to that fast path easily enough if neither of the arguments were given.

Side-note: The case of nextInt() is a bit trickier and strikes me as a bit of a 
weird API and I'd consider dropping it too: Having the max value depend on the 
Randomizer engine makes the generated values unstable in regards to switch the 
Engine. And if I read the documentation correctly then it only generates 32 bit 
values (or should it be 31 bits as it returns positive values only?). I think 
it would be clearer to require the Engine to provide at least 4 bytes and then 
specify the default max value of getInt() to be 2^31, optionally overridable 
with something up to PHP_INT_MAX.

So in summary I'd support both adding default values to getInt()/getFloat() as 
well as dropping nextFloat() in favor of getFloat() (and possible nextInt() 
with getInt()). 

Regards,
- Chris

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php