[ 
https://issues.apache.org/jira/browse/JCLOUDS-217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Phillips updated JCLOUDS-217:
------------------------------------

    Affects Version/s: 1.8.0

> URL encoding/decoding issues
> ----------------------------
>
>                 Key: JCLOUDS-217
>                 URL: https://issues.apache.org/jira/browse/JCLOUDS-217
>             Project: jclouds
>          Issue Type: Bug
>          Components: jclouds-core
>    Affects Versions: 1.8.0
>            Reporter: Diwaker Gupta
>         Attachments: JCLOUDS-217-1.patch
>
>
> Over the last 2 days I spent several (frustrating) hours trying to understand 
> URL encoding/decoding in the jclouds code base. In the process I uncovered 
> several issues. This is an umbrella ticket to capture them. Appropriate 
> subtasks should probably be created as and when things are fixed.
> h3. Query parameters need to be decoded prior to calling {{addQueryParameter}}
> {{HttpRequest.Builder.addQueryParameter}} silently tries to decode both key 
> and value (via {{DecodingMultimap}}). So if you don't pass in an encoded 
> value, it can get decoded into the wrong string. E.g., consider the following 
> code:
> {code}
>         HttpRequest request = HttpRequest.builder().method("GET")
>             .endpoint("http://foo.com";)
>             .addQueryParam("foo", "bar+baz")
>             .build();
>         System.out.println(request.getRequestLine());
>  
> # Prints
> GET http://foo.com?foo=bar%20baz HTTP/1.1
> # Note the %20 above. '+' should actually be encoded as '%2B'
> # This does the right thing
>         HttpRequest request = HttpRequest.builder().method("GET")
>             .endpoint("http://foo.com";)
>             .addQueryParam("foo", URLEncoder.encode("bar+baz", "UTF-8"))
>             .build();
>         System.out.println(request.getRequestLine());
> # Prints
> GET http://foo.com?foo=bar%2Bbaz HTTP/1.1
> {code}
> h3. Query parameter encoding depends on the order in which they are added
> It seems like the _expected_ behavior is that a plus '+' gets encoded 
> differently depending on whether or not the query parameter in which it 
> appears is the last parameter (see 
> {{HttpRequestTest.testAddBase64AndUrlEncodedQueryParams}}). Compare:
> {code}
>         HttpRequest request1 = HttpRequest.builder().method("GET")
>             .endpoint("http://foo.com";)
>             .addQueryParam("foo", value)
>             .addQueryParam("a", "b")
>             .build();
>         HttpRequest request2 = HttpRequest.builder().method("GET")
>             .endpoint("http://foo.com";)
>             .addQueryParam("a", "b")
>             .addQueryParam("foo", value)
>             .build();
>         System.out.println(request1.getRequestLine());
>         System.out.println(request2.getRequestLine());
> # Prints
> GET http://foo.com?foo=bar%20baz&a=b HTTP/1.1
> GET http://foo.com?a=b&foo=bar%2Bbaz HTTP/1.1
> # Notice the %20 vs %2B above
> {code}
> h3. {{Strings2.urlEncode}} and {{Strings2.urlDecode}} don't have symmetric 
> implementations
> {{Strings2.urlEncode}} calls {{URLEncoder.encode}} and then additionally 
> substitutes '+' and '*' in order to be more browser friendly. 
> {{Strings2.urlDecode}} on the other hand simply wraps {{URLDecoder.decode}}. 
> While this is fine from a functionality standpoint, having a symmetric 
> inverse implementation is easier to reason about and would guard us against 
> future bugs in {{URLEncoder}}. At the minimum {{Strings2.urlEncode}} should 
> enforce that no '+' and '*' signs exist in the intermediate encoded string.
> h3. Every call in {{HttpRequest.Builder}} creates a _new_ UriBuilder
> E.g.:
> {code}
> # uriBuilder(endpoint) internally calls new UriBuilder
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
> uriBuilder(endpoint).addQuery(name, values).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
> uriBuilder(endpoint).addQuery(name, values).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
> uriBuilder(endpoint).addQuery(parameters).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
> uriBuilder(endpoint).replaceQuery(name, values).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
> uriBuilder(endpoint).replaceQuery(name, values).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
> uriBuilder(endpoint).replaceQuery(parameters).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
> uriBuilder(endpoint).path(path).build();
> {code}
> Given that jclouds is largely driven by HTTP requests, this is clearly 
> sub-optimal.
> I'll add more issues as I find them.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to