Re: Semicolon not allowed in userinfo

2023-10-03 Thread Daniel Stenberg

On Tue, 3 Oct 2023, Tim Rühsen wrote:


My version of curl (8.3.0) doesn't accept it:

curl -vvv 'http://a ;b:c@xyz'
* URL rejected: Malformed input to a URL function


That's in no way a legal URL (according to RFC 3986) and it is not the
semicolon that causes curl to reject it. It is the space.


But I don't know if that is maybe your clients or the mailing list software 
that botched it so badly?
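
To illustrate the point about the space, here is a minimal sketch. It assumes Python's standard urllib.parse.quote as a stand-in for whatever the client uses to build the URL, and the variable names are only illustrative. Percent-encoding the space while leaving the semicolon alone gives a userinfo that RFC 3986 permits:

```python
from urllib.parse import quote

# sub-delims from RFC 3986; these are legal in userinfo and are left as-is.
SUB_DELIMS = "!$&'()*+,;="

user = "a ;b"      # user name as it appeared in the thread, space included
password = "c"

# quote() never escapes unreserved characters; anything else that is not
# listed in `safe` (here: the space) is percent-encoded.
userinfo = quote(user, safe=SUB_DELIMS) + ":" + quote(password, safe=SUB_DELIMS)
url = f"http://{userinfo}@xyz"
print(url)
```

Only the space changes (it becomes %20); the semicolon passes through untouched because it is a sub-delim.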


--

 / daniel.haxx.se


Re: Semicolon not allowed in userinfo

2023-10-03 Thread Tim Rühsen

Hi,

On 10/2/23 10:55, Bachir Bendrissou wrote:

Hi,

The following url example contains a semicolon in the userinfo segment:


*http://a ;b:c@xyz*

Wget rejects this url with the following error message:

*http://a ;b:c@xyz: Bad port number.*

It seems that Wget sees "c" as a port number. When "c" is replaced by a
digit, Wget accepts the url and attempts to resolve "xyz".


Wget doesn't strictly follow the current specs; its parsing is lenient so
that it accepts some types of badly formatted URLs seen in the wild.


But we should possibly become stricter and more compliant with current specs.



It's worth noting that curl and aria2 both accept the url example.


My version of curl (8.3.0) doesn't accept it:

curl -vvv 'http://a ;b:c@xyz'
* URL rejected: Malformed input to a URL function
* Closing connection
curl: (3) URL rejected: Malformed input to a URL function

All the URL parsers are slightly different when it comes to edge cases.
I'd consider curl as a good reference.


Why is the semicolon not allowed in userinfo, even though other special
characters are allowed?


First of all, userinfo does not allow spaces at all (look at 
https://datatracker.ietf.org/doc/html/rfc3986).

  userinfo    = *( unreserved / pct-encoded / sub-delims / ":" )
  unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
  sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="
  pct-encoded = "%" HEXDIG HEXDIG
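
The grammar quoted above can be turned into a small check. This is only a sketch in Python, not how Wget or curl implement it; the function name is_valid_userinfo and the constant names are made up for illustration. It shows that the semicolon is accepted as a sub-delim while a raw space is not:

```python
import re

# Character classes written out from the RFC 3986 ABNF quoted above.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"

# userinfo = *( unreserved / pct-encoded / sub-delims / ":" )
USERINFO_RE = re.compile(rf"(?:[{UNRESERVED}{SUB_DELIMS}:]|{PCT_ENCODED})*")


def is_valid_userinfo(userinfo: str) -> bool:
    """Return True if the whole string matches the userinfo rule."""
    return USERINFO_RE.fullmatch(userinfo) is not None


for candidate in ("a;b:c", "a ;b:c", "a%20;b:c"):
    print(repr(candidate), is_valid_userinfo(candidate))
```

With the example from this thread, "a;b:c" and "a%20;b:c" pass, and "a ;b:c" fails because of the space.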



Thank you,
Bachir


Regards, Tim

