Hi

On 9/16/23 01:17, Niels Dossche wrote:
[…]

Thank you for the fruitful discussion. This updated API is much better.

When reading the updated RFC yesterday, I initially wanted to complain about the "The options argument" section still existing, but the $options argument nowhere to be seen. But I see that you now added it.

One thing I now note that could also be improved, now that we make some larger improvements to the API:

The $options argument still takes an integer and expects the developer to use the bitwise-or operator to combine multiple constants to set the appropriate flags. Despite this being a commonly used pattern in the standard library, I think it provides for a bad developer experience and makes it harder to extend the list of options for non-boolean options. I would never design an API like this when creating a userland library.

I have two suggestions that might or might not be better:

1. Make the options explicit arguments with a default value and expect the user to leverage named arguments to override specific arguments, e.g.

    HtmlDocument::fromString(
      string $string,
      bool $ignoreErrors = false, // LIBXML_NOERROR
      bool $reducedMemoryUsage = false, // LIBXML_COMPACT
      bool $impliedTags = true, // LIBXML_HTML_NOIMPLIED
    );

This would provide for improved discoverability and the options checking would be implicit and easily available in IDEs.

2. Make the option parameter an array that takes an array of flags. This could either be an associative array (like used with unserialize, setcookie or password_hash) or a list of self-descriptive options. The latter would allow for a clean(er) migration to tagged-unions if/when they make it into PHP.

Example usage:

    HtmlDocument::fromString(
      '<html></html>',
      [
        LIBXML_NOERROR,
        LIBXML_COMPACT,
        LIBXML_HTML_NOIMPLIED,
      ],
    );

With tagged-unions it could look like this in a future version, note the HtmlVersion option that takes a parameter.

    HtmlDocument::fromString(
      '<html></html>',
      [
        Dom\LoadingOption::IgnoreError,
        Dom\LoadingOption::ReducedMemoryUsage,
        Dom\LoadingOption::NoImpliedTags,
        Dom\LoadingOption::HtmlVersion(5),
      ],
    );

Even if not yet reaping the full benefits of the "list of options", I think it more naturally splits across multiple lines compared to 'bitwise-or'ing and it can also be more useful in stack traces for error handlers that are able to handle arrays, because there won't be some non-descript integer you have to decode manually.

What do you think?

Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php

Reply via email to