Diwaker Gupta created JCLOUDS-217:
-------------------------------------
Summary: URL encoding/decoding issues
Key: JCLOUDS-217
URL: https://issues.apache.org/jira/browse/JCLOUDS-217
Project: jclouds
Issue Type: Bug
Components: jclouds-core
Reporter: Diwaker Gupta
Over the last 2 days I spent several (frustrating) hours trying to understand
URL encoding/decoding in the jclouds code base. In the process I uncovered
several issues. This is an umbrella ticket to capture them. Appropriate
subtasks should probably be created as and when things are fixed.
h3. Query parameters need to be decoded prior to calling {{addQueryParameter}}
{{HttpRequest.Builder.addQueryParameter}} silently tries to decode both key and
value (via {{DecodingMultimap}}). So if you don't pass in an encoded value, it
can get decoded into the wrong string. E.g., consider the following code:
{code}
HttpRequest request = HttpRequest.builder().method("GET")
.endpoint("http://foo.com")
.addQueryParam("foo", "bar+baz")
.build();
System.out.println(request.getRequestLine());
# Prints
GET http://foo.com?foo=bar%20baz HTTP/1.1
# Note the %20 above. '+' should actually be encoded as '%2B'
# This does the right thing
HttpRequest request = HttpRequest.builder().method("GET")
.endpoint("http://foo.com")
.addQueryParam("foo", URLEncoder.encode("bar+baz", "UTF-8"))
.build();
System.out.println(request.getRequestLine());
# Prints
GET http://foo.com?foo=bar%2Bbaz HTTP/1.1
{code}
h3. Query parameter encoding depends on the order in which they are added
It seems like the _expected_ behavior is that a plus '+' gets encoded
differently depending on whether or not the query parameter in which it appears
is the last parameter (see
{{HttpRequestTest.testAddBase64AndUrlEncodedQueryParams}}). Compare:
{code}
HttpRequest request1 = HttpRequest.builder().method("GET")
.endpoint("http://foo.com")
.addQueryParam("foo", value)
.addQueryParam("a", "b")
.build();
HttpRequest request2 = HttpRequest.builder().method("GET")
.endpoint("http://foo.com")
.addQueryParam("a", "b")
.addQueryParam("foo", value)
.build();
System.out.println(request1.getRequestLine());
System.out.println(request2.getRequestLine());
# Prints
GET http://foo.com?foo=bar%20baz&a=b HTTP/1.1
GET http://foo.com?a=b&foo=bar%2Bbaz HTTP/1.1
# Notice the %20 vs %2B above
{code}
h3. {{Strings2.urlEncode}} and {{Strings2.urlDecode}} don't have symmetric
implementations
{{Strings2.urlEncode}} calls {{URLEncoder.encode}} and then additionally
substitutes '+' and '*' in order to be more browser friendly.
{{Strings2.urlDecode}} on the other hand simply wraps {{URLDecoder.decode}}.
While this is fine from a functionality standpoint, having a symmetric inverse
implementation is easier to reason about and would guard us against future bugs
in {{URLEncoder}}. At the minimum {{Strings2.urlEncode}} should enforce that no
'+' and '*' signs exist in the intermediate encoded string.
h3. Every call in {{HttpRequest.Builder}} creates a _new_ UriBuilder
E.g.:
{code}
# uriBuilder(endpoint) internally calls new UriBuilder
core/src/main/java/org/jclouds/http/HttpRequest.java: endpoint =
uriBuilder(endpoint).addQuery(name, values).build();
core/src/main/java/org/jclouds/http/HttpRequest.java: endpoint =
uriBuilder(endpoint).addQuery(name, values).build();
core/src/main/java/org/jclouds/http/HttpRequest.java: endpoint =
uriBuilder(endpoint).addQuery(parameters).build();
core/src/main/java/org/jclouds/http/HttpRequest.java: endpoint =
uriBuilder(endpoint).replaceQuery(name, values).build();
core/src/main/java/org/jclouds/http/HttpRequest.java: endpoint =
uriBuilder(endpoint).replaceQuery(name, values).build();
core/src/main/java/org/jclouds/http/HttpRequest.java: endpoint =
uriBuilder(endpoint).replaceQuery(parameters).build();
core/src/main/java/org/jclouds/http/HttpRequest.java: endpoint =
uriBuilder(endpoint).path(path).build();
{code}
Given that jclouds is largely driven by HTTP requests, this is clearly
sub-optimal.
I'll add more issues as I find them.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira