[
https://issues.apache.org/jira/browse/HTTPCLIENT-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16536656#comment-16536656
]
ASF subversion and git services commented on HTTPCLIENT-1927:
-------------------------------------------------------------
Commit 5fea3e0439df77c91c34c5b6482ad79770b63d0c in httpcomponents-client's
branch refs/heads/4.5.x from [~olegk]
[ https://git-wip-us.apache.org/repos/asf?p=httpcomponents-client.git;h=5fea3e0
]
HTTPCLIENT-1927: URLEncodedUtils#parse breaks at double quotes when parsing
unquoted values
> URLEncodedUtils#parse breaks at double quotes when parsing unquoted values
> --------------------------------------------------------------------------
>
> Key: HTTPCLIENT-1927
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1927
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient (async), HttpClient (classic)
> Affects Versions: 4.5.5, 4.5.6
> Reporter: Kadeem Hassam
> Priority: Minor
>
> Assume a query string like {{a=b"c&d=e}}
> The mapping for that query string would reasonably be expected to be
> {code:java}
> [a=b"c, d=e]
> {code}
> The actual result using httpcore 4.4.9 is
> {code:java}
> [a=bc&d=e]
> {code}
> Example code:
> {code:java}
> import java.nio.charset.StandardCharsets;
> import org.apache.http.client.utils.URLEncodedUtils;
>
> class QueryParser {
>     public static void main(String[] args) {
>         System.out.println(URLEncodedUtils.parse("a=b\"c&d=e", StandardCharsets.UTF_8, '&'));
>     }
> }
> {code}
> {{URLEncodedUtils}} from {{httpclient}} uses the {{TokenParser}} in
> {{httpcore}}.
> After successfully parsing the name ({{a}}), the value is parsed using the
> {{parseValue(CharArrayBuffer, ParserCursor,
> BitSet)}}[[link|https://github.com/apache/httpcomponents-core/blob/4.4.x/httpcore/src/main/java/org/apache/http/message/TokenParser.java#L119-L144]]
> method.
> Because the first character is neither a delimiter nor a double quote,
> {{parseValue}} calls {{copyUnquotedContent(CharArrayBuffer, ParserCursor, BitSet,
> StringBuilder)}}[[link|https://github.com/apache/httpcomponents-core/blob/4.4.x/httpcore/src/main/java/org/apache/http/message/TokenParser.java#L205-L221]],
> which returns when the double quote is reached
> ([[link|https://github.com/apache/httpcomponents-core/blob/4.4.x/httpcore/src/main/java/org/apache/http/message/TokenParser.java#L213-L214]])
> instead of when the delimiter is reached.
> {{parseValue}} then continues parsing the value, but as quoted content this
> time (because the current position is now a quote character). Copying quoted
> content reasonably does not break on the delimiter set, so it ends up
> consuming the rest of the query string.
> Other URI parsers, such as Python's {{urllib.parse}}, parse the query string
> as expected.
> {noformat}
> Python 3.6.1 (default, Mar 23 2017, 13:04:44) [GCC] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import urllib.parse
> >>> urllib.parse.parse_qs('a=b"c&d=e')
> {'a': ['b"c'], 'd': ['e']}
> {noformat}
> Although I haven't explicitly tested with {{httpcore5}}, the code for
> {{TokenParser}} appears equivalent to {{4.4.9}}.
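
The control flow described in the issue can be reproduced with a small
stand-alone model. The sketch below is an illustration only: it is not the
httpcore {{TokenParser}} source, and the class name QuoteSplitModel, the
method names, and the quoteAware flag are all hypothetical. It also leaves out
the whitespace and escape handling a real parser would need. With
quoteAware=true a double quote inside an unquoted value switches to quoted
copying, which ignores '&'; with quoteAware=false the quote is treated as an
ordinary character.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the value-parsing loop described in the
// issue. This is NOT the httpcore TokenParser code; all names are illustrative.
class QuoteSplitModel {

    // Splits a query string into "name=value" entries separated by '&'.
    static List<String> parse(String query, boolean quoteAware) {
        List<String> pairs = new ArrayList<>();
        int pos = 0;
        while (pos < query.length()) {
            // Name: everything up to '=' or '&'.
            StringBuilder name = new StringBuilder();
            while (pos < query.length()
                    && query.charAt(pos) != '='
                    && query.charAt(pos) != '&') {
                name.append(query.charAt(pos++));
            }
            StringBuilder value = new StringBuilder();
            if (pos < query.length() && query.charAt(pos) == '=') {
                pos++; // skip '='
                pos = parseValue(query, pos, quoteAware, value);
            }
            if (pos < query.length()) {
                pos++; // skip the '&' delimiter
            }
            if (name.length() > 0) {
                pairs.add(name + "=" + value);
            }
        }
        return pairs;
    }

    // Copies one value into dst and returns the position where parsing stopped.
    static int parseValue(String s, int pos, boolean quoteAware, StringBuilder dst) {
        while (pos < s.length()) {
            char c = s.charAt(pos);
            if (c == '&') {
                break; // delimiter ends the value
            } else if (quoteAware && c == '"') {
                // Quoted mode: skip the opening quote, then copy up to the
                // closing quote or the end of input, ignoring '&'.
                pos++;
                while (pos < s.length() && s.charAt(pos) != '"') {
                    dst.append(s.charAt(pos++));
                }
                if (pos < s.length()) {
                    pos++; // skip the closing quote
                }
            } else {
                dst.append(c); // ordinary character of an unquoted value
                pos++;
            }
        }
        return pos;
    }

    public static void main(String[] args) {
        System.out.println(parse("a=b\"c&d=e", true));  // prints [a=bc&d=e]
        System.out.println(parse("a=b\"c&d=e", false)); // prints [a=b"c, d=e]
    }
}
```

The first call matches the actual result reported above, and the second
matches the expected mapping. Whether quotes should carry any meaning in a
query string is a design decision; the model only shows that the lone quote
is what swallows the '&' delimiter.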