On Sun, 10 Oct 2021, Ray Satiro via curl-library wrote:

> If someone passes https://%63url.se/ what is the disadvantage to storing
> it as https://curl.se/ and only returning it like that?

I don't think there is any disadvantage in that case. The possible disadvantage comes instead with weird non-ASCII cases like https://%c3url.se/, because the host will now be returned as the "raw" string "\xc3url.se". (Not that I can think of a reason anyone would provide a name like that.)
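
To make the two cases concrete, here is a minimal sketch in Python (not curl's actual C code) of what percent-decoding does to each host. The helper names decode_host and is_ascii_host are made up for this illustration; only unquote_to_bytes is a real standard-library call.

```python
from urllib.parse import unquote_to_bytes


def decode_host(host: str) -> bytes:
    """Percent-decode a URL host component into raw bytes (hypothetical helper)."""
    return unquote_to_bytes(host)


def is_ascii_host(raw: bytes) -> bool:
    """Return True if every decoded byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in raw)


ascii_case = decode_host("%63url.se")      # %63 is the letter 'c'
non_ascii_case = decode_host("%c3url.se")  # %c3 is a lone high byte

print(ascii_case)       # b'curl.se'
print(non_ascii_case)   # b'\xc3url.se'

# The second result is not even valid UTF-8: 0xc3 starts a two-byte
# sequence, but the following byte 'u' is not a continuation byte.
try:
    non_ascii_case.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
```

The ASCII case decodes to an ordinary hostname, while the non-ASCII one leaves the caller holding a byte string that is neither ASCII nor valid UTF-8, which is the "raw" string mentioned above.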

> Is it really necessary to store the encoded version as well when it's ascii only?

That's not what's being done. It was only mentioned as an option but it's not an option anyone likes.

> I've never heard of anyone doing percent encoded ascii hostnames before.

Percent-encoded ASCII hostnames seem to be very rare in general, which is probably why this hasn't been reported before. After all, curl has parsed a few URLs over the years and this hasn't been reported until now! [*]

However, the URL syntax says they can be provided like that, so we should support it. Doing so makes our parser behave more in line with the spec and more similar to how other parsers (are expected to) behave.

[*] = this issue has four "reported-by" names in the bug report simply because those are the four authors of a paper on URL parsers, their differences and the problems associated with those differences. The paper is in the works and I have reviewed it.

--

 / daniel.haxx.se
 | Commercial curl support up to 24x7 is available!
 | Private help, bug fixes, support, ports, new features
 | https://curl.se/support.html
