On Mon, Jul 14, 2025 at 11:26 PM Andrey Andreev <n...@devilix.net> wrote:
> Hi all, > > I have a few suggestions, starting with naming improvements: > - Forgiving instead of Lenient (align with > https://infra.spec.whatwg.org/#forgiving-base64) > - Shorten the option names; one example would be Variable/Constant instead > of Unprotected/ConstantTime, but I think most could be rethinked > - $input or $data instead of $decoded (could actually do the same instead > of $encoded, but that one doesn't feel as wrong) > - Not strictly about naming, but it similarly feels wrong that > UnableToDecodeException extends EncodingException (which seems to have no > purpose) > > However, I'm not a fan of how these simple functions have so many option > flags ... it feels forced, trying to accomodate too much at once. I'd > rather have discrete functions, like base64_*() and base64url_*() - I chose > this example because base64 and base64url also have arguably different > desirable defaults for padding; almost all pad-stripping I've seen in the > wild has been for the purposes of converting to base64url. > On a semi-related note, I'm not sure if including the IMAP variant isn't > complicating things for no good reason (it is extra-niche, and we have > imap_binary/base64() already). > > Also, the RFC doesn't specify whether DecodingMode::Strict would cause an > error in case of missing padding? > > That being said, I'm very glad to see this! > > Cheers, > Andrey. > Hi Andrey, Forgiving instead of Lenient (align with https://infra.spec.whatwg.org/#forgiving-base64) > I will adapt the text and use `Forgiving` instead Shorten the option names; one example would be Variable/Constant instead of Unprotected/ConstantTime, but I think most could be rethinked I will adapt the text and use `Variable/Constant` instead, thanks for the suggestions, $input or $data instead of $decoded (could actually do the same instead of $encoded, but that one doesn't feel as wrong) > Usage of `$encoded` and `$decoded` as parameter names is done to emphasize the *state of the data**,* rather than its format. This is helpful as it avoids ambiguity ( `$data` is generic) and makes data flow more explicit. Not strictly about naming, but it similarly feels wrong that UnableToDecodeException extends EncodingException (which seems to have no purpose) This follows the RFC guidelines regarding the introduction of new exceptions to the language, particularly within extensions. Each exception should reference its own exception marker (in this proposal, `EncodingException`). Additionally, we introduce a more specific exception to handle errors that occur during the decoding of encoded data. On a semi-related note, I'm not sure if including the IMAP variant isn't complicating things for no good reason (it is extra-niche, and we have imap_binary/base64() already). The `ext/imap` extension from which those functions are coming from [has been unbundled from PHP](https://wiki.php.net/rfc/unbundle_imap_pspell_oci8) I chose this example because base64 and base64url also have arguably different desirable defaults for padding; almost all pad-stripping I've seen in the wild has been for the purposes of converting to base64url. Base64 and Base64url vary on their alphabet and on the presence or absence of the padding string. With the proposed API it would mean doing the following ```php \Encoding::base64_encode('Hello world!'); //base64 standard encoding \Encoding::base64_encode('Hello world!', variant: \Encoding\Base64::UrlSafe); //base64 URL Safe encoding ``` Padding is by default controlled by the variant. Since UrlSafe does not need padding no padding will be used. You should not even need to specify the presence or absence of padding. Unless you want to do something really specific for your use case. In which case being explicit in what you want to achieve is always a good design choice. The default values for the options are chosen to cover the most common use cases, so in many situations you won’t need to specify them at all—making the API easier to use than it might initially appear. Also, the RFC doesn't specify whether DecodingMode::Strict would cause an error in case of missing padding? Strict decoding behavior depends on the variant. For example, in the case of Base64url, padding is considered optional. Therefore, under `DecodingMode::Strict`, the absence of `=` padding characters will not trigger an exception, as this behavior is compliant with the relevant RFC. In contrast, for `Base64::Standard`, omitting the padding character *in strict mode *will result in an exception, since padding is mandatory where applicable with such a variant. For clarity, I will revise the RFC to explicitly state the behavior of each encoding variant during strict mode decoding. Best regards, Ignace Nyamagana Butera