Diwaker Gupta created JCLOUDS-217:
-------------------------------------

             Summary: URL encoding/decoding issues
                 Key: JCLOUDS-217
                 URL: https://issues.apache.org/jira/browse/JCLOUDS-217
             Project: jclouds
          Issue Type: Bug
          Components: jclouds-core
            Reporter: Diwaker Gupta


Over the last 2 days I spent several (frustrating) hours trying to understand 
URL encoding/decoding in the jclouds code base. In the process I uncovered 
several issues. This is an umbrella ticket to capture them. Appropriate 
subtasks should probably be created as and when things are fixed.

h3. Query parameters need to be decoded prior to calling {{addQueryParameter}}

{{HttpRequest.Builder.addQueryParameter}} silently tries to decode both key and 
value (via {{DecodingMultimap}}). So if you don't pass in an encoded value, it 
can get decoded into the wrong string. E.g., consider the following code:

{code}
        HttpRequest request = HttpRequest.builder().method("GET")
            .endpoint("http://foo.com";)
            .addQueryParam("foo", "bar+baz")
            .build();
        System.out.println(request.getRequestLine());
 
# Prints
GET http://foo.com?foo=bar%20baz HTTP/1.1

# Note the %20 above. '+' should actually be encoded as '%2B'
# This does the right thing
        HttpRequest request = HttpRequest.builder().method("GET")
            .endpoint("http://foo.com";)
            .addQueryParam("foo", URLEncoder.encode("bar+baz", "UTF-8"))
            .build();
        System.out.println(request.getRequestLine());

# Prints
GET http://foo.com?foo=bar%2Bbaz HTTP/1.1
{code}

h3. Query parameter encoding depends on the order in which they are added

It seems like the _expected_ behavior is that a plus '+' gets encoded 
differently depending on whether or not the query parameter in which it appears 
is the last parameter (see 
{{HttpRequestTest.testAddBase64AndUrlEncodedQueryParams}}). Compare:

{code}
        HttpRequest request1 = HttpRequest.builder().method("GET")
            .endpoint("http://foo.com";)
            .addQueryParam("foo", value)
            .addQueryParam("a", "b")
            .build();
        HttpRequest request2 = HttpRequest.builder().method("GET")
            .endpoint("http://foo.com";)
            .addQueryParam("a", "b")
            .addQueryParam("foo", value)
            .build();
        System.out.println(request1.getRequestLine());
        System.out.println(request2.getRequestLine());

# Prints
GET http://foo.com?foo=bar%20baz&a=b HTTP/1.1
GET http://foo.com?a=b&foo=bar%2Bbaz HTTP/1.1

# Notice the %20 vs %2B above
{code}

h3. {{Strings2.urlEncode}} and {{Strings2.urlDecode}} don't have symmetric 
implementations

{{Strings2.urlEncode}} calls {{URLEncoder.encode}} and then additionally 
substitutes '+' and '*' in order to be more browser friendly. 
{{Strings2.urlDecode}} on the other hand simply wraps {{URLDecoder.decode}}. 
While this is fine from a functionality standpoint, having a symmetric inverse 
implementation is easier to reason about and would guard us against future bugs 
in {{URLEncoder}}. At the minimum {{Strings2.urlEncode}} should enforce that no 
'+' and '*' signs exist in the intermediate encoded string.

h3. Every call in {{HttpRequest.Builder}} creates a _new_ UriBuilder

E.g.:

{code}
# uriBuilder(endpoint) internally calls new UriBuilder

core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
uriBuilder(endpoint).addQuery(name, values).build();
core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
uriBuilder(endpoint).addQuery(name, values).build();
core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
uriBuilder(endpoint).addQuery(parameters).build();
core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
uriBuilder(endpoint).replaceQuery(name, values).build();
core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
uriBuilder(endpoint).replaceQuery(name, values).build();
core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
uriBuilder(endpoint).replaceQuery(parameters).build();
core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = 
uriBuilder(endpoint).path(path).build();
{code}

Given that jclouds is largely driven by HTTP requests, this is clearly 
sub-optimal.

I'll add more issues as I find them.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to