[ 
https://issues.apache.org/jira/browse/HTTPCLIENT-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741588#comment-16741588
 ] 

Raymond Cuenen commented on HTTPCLIENT-1960:
--------------------------------------------

[~olegk] I think the {{URIBuilder}} class is designed mainly for use with 
the {{java.net.URI}} class, while I'm using it to construct a valid URI string 
based upon user input (in this case to configure a Camel endpoint).

I did some more research and found that the {{java.net.URI}} class is actually 
based upon RFC 2396, where the URI was first defined, and not on RFC 3986, 
where the URI definition was finalized. There's a major difference in what is 
called the "path component" between the two RFCs.

In RFC 2396 all paths are basically absolute; {{abs_path = "/" path_segments}}, 
or you could consider them relative by interpreting the first "/" as component 
separator. When the first "/" is omitted the {{hier_part}} turns into a 
{{opaque_part}}, at which point no further parsing takes place.
This requires that the path component of a URI always start with a slash, as 
implemented in {{URIBuilder#normalize}} and tested in 
{{testPathNoLeadingSlash}}.
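
To illustrate the RFC 2396 behavior described above, here is a minimal sketch 
that uses only {{java.net.URI}}. The class name {{OpaqueVsHierarchical}} and 
the helper {{describe}} are made up for this example and are not part of 
HttpClient.

```java
import java.net.URI;

class OpaqueVsHierarchical {

    // Reports how java.net.URI classifies the given string and what path it sees.
    static String describe(String s) {
        URI uri = URI.create(s);
        return (uri.isOpaque() ? "opaque" : "hierarchical")
                + ", path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // Scheme-specific part starts with "/": parsed as a hierarchical URI.
        System.out.println(describe("ftp:/etc/motd")); // hierarchical, path=/etc/motd

        // Leading "/" omitted: treated as an opaque part, the path is not parsed.
        System.out.println(describe("ftp:etc/motd"));  // opaque, path=null
    }
}
```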

In RFC 3986 all paths belong to the {{hier-part}}, since there is no 
opaque-part, so paths can be {{path-absolute}}, {{path-rootless}} or 
{{path-empty}}. There's a special case for when there is an authority component 
present: the path component must then either be empty or start with a slash, 
making that first "/" basically a component separator. There's also a special 
case for when no authority component is present and the path component starts 
with "//": this is not allowed, but can be worked around by percent-encoding 
one of the two slashes.
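
The percent-encoding workaround can be sketched with plain {{java.net.URI}}. 
This is only an illustration: the class name {{DoubleSlashWorkaround}} and the 
helper {{components}} are invented here, and the sketch says nothing about how 
{{URIBuilder}} itself should encode the path.

```java
import java.net.URI;

class DoubleSlashWorkaround {

    // Lists the components java.net.URI extracts from the given string.
    static String components(String s) {
        URI uri = URI.create(s);
        return "host=" + uri.getHost()
                + ", rawPath=" + uri.getRawPath()
                + ", path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // The second slash is written as %2F, so no authority is detected
        // and the decoded path still begins with two slashes.
        System.out.println(components("ftp:/%2Fblah"));
        // host=null, rawPath=/%2Fblah, path=//blah
    }
}
```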

Both RFCs allow for as many empty path segments as you like, which is now 
also honored by the {{URIBuilder}} implementation. However, there might be 
scheme-specific constraints on the syntax of the path component or what is 
allowed. I'm not sure if there are any for {{http}} though (its main use?).
At least this {{URIBuilder}} is (now) a generic URI builder, usable in many 
contexts other than {{http}}. But be advised that because of {{java.net.URI}} 
it is RFC 2396 compliant and not necessarily RFC 3986 compliant.
Take this test for example:
{code:java}
@Test
public void testPathNoAuthorityMultipleLeadingSlash() throws Exception {
    final URI uri = new URIBuilder()
            .setScheme("ftp")
            .setPath("//blah")
            .build();
    Assert.assertThat(uri, CoreMatchers.equalTo(URI.create("ftp://blah")));
}
{code}
The test succeeds, but the resulting URI has an empty path and a host with the 
value "blah". This is technically not what was entered into the builder.
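
A small sketch of what the parsed result looks like, again using only 
{{java.net.URI}} (the class name {{AuthorityCapture}} and the helper 
{{components}} are hypothetical and exist only for this illustration):

```java
import java.net.URI;

class AuthorityCapture {

    // Shows which part of the string ended up as host and which as path.
    static String components(String s) {
        URI uri = URI.create(s);
        return "host=" + uri.getHost() + ", path='" + uri.getPath() + "'";
    }

    public static void main(String[] args) {
        // "//" directly after the scheme introduces an authority,
        // so "blah" is read as the host and the path is left empty.
        System.out.println(components("ftp://blah")); // host=blah, path=''
    }
}
```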

> URIBuilder incorrect handling of multiple leading slashes in path component
> ---------------------------------------------------------------------------
>
>                 Key: HTTPCLIENT-1960
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1960
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient (async), HttpClient (classic)
>    Affects Versions: 4.5.5
>            Reporter: Raymond Cuenen
>            Assignee: Oleg Kalnichevski
>            Priority: Major
>
> If original path startsWith '/' it is removed by normalizePath; in that case 
> it should be added again URI-encoded. For example: A path value of 
> '/etc/motd' becomes:
> {code:java}
> ftp://myn...@host.dom/etc/motd{code}
> while it should be:
> {code:java}
> ftp://myn...@host.dom/%2Fetc/motd{code}
> Only when the path value is 'etc/motd' should it become 
> "ftp://myn...@host.dom/etc/motd"
>   
> Fix for this issue in URIBuilder.java:
> {noformat}
> private String buildString() {
> ...
>     if (this.encodedPath != null) {
>         sb.append(normalizePath(this.encodedPath, sb.length() == 0));
>     } else if (this.path != null) {
>         String encodedPath = encodePath(normalizePath(this.path, sb.length() == 0));
>         // Start fix for paths starting with '/'
>         // If original path startsWith '/' it is removed by normalizePath; in that case
>         // it should be added again URI-encoded.
>         if (this.path.startsWith("/")) {
>             encodedPath = "/%2F" + encodedPath.substring(1);
>         }
>         // End fix
>         sb.append(encodedPath);
>     }
> ...
> }{noformat}


