Hi
On 2025-03-27 23:49, Ignace Nyamagana Butera wrote:
Hi Máté,
for RFC 3986:
https://datatracker.ietf.org/doc/html/rfc3986#section-5.3), and then
this string is parsed and validated. Unfortunately, I recently
realized that this approach may leave room for some kind of parsing
confusion attack, namely when the scheme is for example "https", the
authority is empty, and the path is "example.com". This will result
in a https://example.com
URI. I believe a similar bug is not possible with the rest of the
components because they have their delimiters. So possibly some
other solution will be needed, or maybe adding some additional
validation (?).
This is not correct according to RFC3986
https://datatracker.ietf.org/doc/html/rfc3986#section-3
*When authority is present, the path must either be empty or begin with
a slash ("/") character. When authority is not present, the path cannot
begin with two slash characters ("//").*
So for RFC3986 your example should throw a Uri\InvalidUriException 🙂,
and in the case of the WhatwgUrl algorithm it should trigger a soft
error and correct the behaviour for the http(s) schemes.
This is also one of the many reasons why, at least for RFC3986, the path
component can never be `null`, but that's another discussion. Like I
said, having a `fromComponents` named constructor would allow the
"removal" of the need for a UriBuilder (in your future section) and
would IMHO be useful outside of the context of the http(s) scheme. I
can understand it being left out of the current implementation; it might
be brought back as a future improvement.
I just tested this with the implementation and it also appears to not
yet be correct:
var_dump((new Uri\Rfc3986\Uri("example.com"))->getHost()); // NULL
var_dump((new Uri\Rfc3986\Uri("example.com"))->withScheme('https')->getHost()); // string(11) "example.com"
var_dump((new Uri\Rfc3986\Uri("example.com"))->withScheme('https')->toRawString()); // string(19) "https://example.com"
and
var_dump((new Uri\Rfc3986\Uri("foo/bar"))->withPath('//foo/bar')->getHost()); // string(3) "foo"
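For illustration, here is a minimal sketch of what the extra validation could look like when recomposing a URI from components as described in RFC 3986, section 5.3. It is not the extension's actual code: the function name `recomposeUri()`, its signature, and the use of `InvalidArgumentException` are assumptions made up for this example, and it only checks the two path rules from section 3 quoted above. `null` stands for "component absent", while an empty string stands for "present but empty".

```php
<?php

/**
 * Hypothetical helper, for illustration only: recompose a URI from its
 * components (RFC 3986, section 5.3) and reject the path/authority
 * combinations that section 3 forbids.
 */
function recomposeUri(
    ?string $scheme,
    ?string $authority,
    string $path,
    ?string $query = null,
    ?string $fragment = null
): string {
    // Authority present (even if empty): path must be empty or start with "/".
    if ($authority !== null && $path !== '' && $path[0] !== '/') {
        throw new InvalidArgumentException(
            'With an authority, the path must be empty or begin with "/".'
        );
    }

    // Authority absent: path must not start with "//", otherwise its first
    // segment would be re-parsed as an authority.
    if ($authority === null && str_starts_with($path, '//')) {
        throw new InvalidArgumentException(
            'Without an authority, the path cannot begin with "//".'
        );
    }

    // Recomposition as laid out in RFC 3986, section 5.3.
    $uri = '';
    if ($scheme !== null) {
        $uri .= $scheme . ':';
    }
    if ($authority !== null) {
        $uri .= '//' . $authority;
    }
    $uri .= $path;
    if ($query !== null) {
        $uri .= '?' . $query;
    }
    if ($fragment !== null) {
        $uri .= '#' . $fragment;
    }

    return $uri;
}
```

With these checks, `recomposeUri('https', '', 'example.com')` and `recomposeUri(null, null, '//foo/bar')` both throw, while `recomposeUri('https', null, 'example.com')` yields `https:example.com`, which does not re-parse with `example.com` as the host.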
Best regards
Tim Düsterhus