[
https://issues.apache.org/jira/browse/HTTPCLIENT-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741588#comment-16741588
]
Raymond Cuenen commented on HTTPCLIENT-1960:
--------------------------------------------
[~olegk] I think the {{URIBuilder}} class is mainly designed for use with
the {{java.net.URI}} class, while I'm using it to construct a valid URI string
based on user input (in this case to configure a Camel endpoint).
I did some more research and found that the {{java.net.URI}} class is actually
based on RFC 2396, where the URI was first defined, and not on RFC 3986,
where the URI definition was finalized. There's a major difference between the
two RFCs in what is called the "path component".
In RFC 2396 all paths are basically absolute ({{abs_path = "/" path_segments}}),
or you could consider them relative by interpreting the first "/" as a component
separator. When the first "/" is omitted the {{hier_part}} turns into an
{{opaque_part}}, at which point no further parsing takes place.
This requires that the path component of a URI always starts with a slash, as
implemented in {{URIBuilder#normalize}} and tested in
{{testPathNoLeadingSlash}}.
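To illustrate the opaque versus hierarchical split described above, here is a minimal sketch that uses only {{java.net.URI}}. The class name {{OpaqueVersusHierarchical}} and the helper {{describe}} are made up for this example and are not part of HttpClient.

```java
import java.net.URI;

// Illustrative only: shows how java.net.URI (RFC 2396 based) classifies
// a scheme-specific part with and without a leading slash.
final class OpaqueVersusHierarchical {

    // Summarizes how java.net.URI parsed the given string.
    static String describe(String text) {
        URI uri = URI.create(text);
        return (uri.isOpaque() ? "opaque" : "hierarchical")
                + ", path=" + uri.getPath()
                + ", scheme-specific part=" + uri.getSchemeSpecificPart();
    }

    public static void main(String[] args) {
        // Leading slash: parsed as a hierarchical URI with a path.
        System.out.println(describe("ftp:/etc/motd"));
        // No leading slash: parsed as an opaque URI, so getPath() is null.
        System.out.println(describe("ftp:etc/motd"));
    }
}
```

In the second case the text after the scheme is only available through {{getSchemeSpecificPart()}}, which matches the "no further parsing" behaviour of an {{opaque_part}}.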
In RFC 3986 all paths belong to the {{hier-part}}, since there is no
opaque-part, so paths can be {{path-absolute}}, {{path-rootless}} or
{{path-empty}}. There's a special case for when an authority component is
present: the path component must then start with a slash, making that first "/"
basically a component separator. There's also a special case for when no
authority component is present and the path component starts with "//": this is
not allowed, but can be worked around by percent-encoding one of the two
slashes.
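The percent-encoding workaround mentioned above can be sketched as follows. The helper {{escapeLeadingDoubleSlash}} is hypothetical and is not how {{URIBuilder}} does it; it only shows that escaping the second slash keeps the text in the path instead of the authority.

```java
import java.net.URI;

// Illustrative only: escapes the second of two leading slashes so the path
// cannot be read as the start of an authority component.
final class RootlessDoubleSlash {

    // Hypothetical helper, not part of HttpClient.
    static String escapeLeadingDoubleSlash(String path) {
        return path.startsWith("//") ? "/%2F" + path.substring(2) : path;
    }

    public static void main(String[] args) {
        URI uri = URI.create("ftp:" + escapeLeadingDoubleSlash("//blah"));
        System.out.println(uri);              // ftp:/%2Fblah
        System.out.println(uri.getHost());    // null, no authority was parsed
        System.out.println(uri.getRawPath()); // /%2Fblah
    }
}
```

Whether the receiving side treats {{%2F}} as equivalent to a literal "/" depends on the scheme, so this is a syntax-level workaround only.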
Both RFCs allow for as many empty path segments as you like, which the
{{URIBuilder}} implementation now honors as well. However, there might be
scheme-specific constraints on the syntax of the path component or what is
allowed. I'm not sure if there are any for {{http}} though (its main use?).
At least this {{URIBuilder}} is (now) a generic URI builder, usable in many
contexts other than {{http}}. But be advised that because of {{java.net.URI}}
it is RFC 2396 compliant and not necessarily RFC 3986 compliant.
Take this test for example:
{code:java}
@Test
public void testPathNoAuthorityMultipleLeadingSlash() throws Exception {
    final URI uri = new URIBuilder()
            .setScheme("ftp")
            .setPath("//blah")
            .build();
    Assert.assertThat(uri, CoreMatchers.equalTo(URI.create("ftp://blah")));
}
{code}
The test succeeds, but the resulting URI has an empty path and a host with the
value "blah". This is technically not what was entered into the builder.
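The reinterpretation can be checked with {{java.net.URI}} alone, without the builder. This is a small sketch; the class name {{AuthorityAmbiguity}} and the helper {{components}} are invented for the example.

```java
import java.net.URI;

// Illustrative only: shows which components java.net.URI assigns to the
// string that the test above expects.
final class AuthorityAmbiguity {

    // Lists the authority, host and path that java.net.URI parsed.
    static String components(String text) {
        URI uri = URI.create(text);
        return "authority=" + uri.getAuthority()
                + ", host=" + uri.getHost()
                + ", path='" + uri.getPath() + "'";
    }

    public static void main(String[] args) {
        // "blah" ends up as the authority and host; the path is empty.
        System.out.println(components("ftp://blah"));
    }
}
```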
> URIBuilder incorrect handling of multiple leading slashes in path component
> ---------------------------------------------------------------------------
>
> Key: HTTPCLIENT-1960
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1960
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient (async), HttpClient (classic)
> Affects Versions: 4.5.5
> Reporter: Raymond Cuenen
> Assignee: Oleg Kalnichevski
> Priority: Major
>
> If original path startsWith '/' it is removed by normalizePath; in that case
> it should be added again URI-encoded. For example: A path value of
> '/etc/motd' becomes:
> {code:java}
> ftp://[email protected]/etc/motd{code}
> while it should be:
> {code:java}
> ftp://[email protected]/%2Fetc/motd{code}
> Only when the path value is 'etc/motd' it should become
> "ftp://[email protected]/etc/motd"
>
> Fix for this issue in URIBuilder.java:
> {noformat}
> private String buildString() {
>     ...
>     if (this.encodedPath != null) {
>         sb.append(normalizePath(this.encodedPath, sb.length() == 0));
>     } else if (this.path != null) {
>         String encodedPath = encodePath(normalizePath(this.path, sb.length() == 0));
>         // Start fix for paths starting with '/'
>         // If original path startsWith '/' it is removed by normalizePath;
>         // in that case it should be added again URI-encoded.
>         if (this.path.startsWith("/")) {
>             encodedPath = "/%2F" + encodedPath.substring(1);
>         }
>         // End fix
>         sb.append(encodedPath);
>     }
>     ...
> }{noformat}