Thibaut created HTTPCLIENT-1257:
-----------------------------------
Summary: Header location automatically converted to ASCII even
though location can contain UTF-8 encoded urls
Key: HTTPCLIENT-1257
URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1257
Project: HttpComponents HttpClient
Issue Type: Bug
Components: HttpClient
Affects Versions: 4.2.2
Reporter: Thibaut
I'm trying to fetch:
http://handheld.vn/content.php?4052-Đánh-giá-máy-tính-bảng-Kindle-Fire-HD-7-inch
Which returns:
2012-10-29 18:54:29,355 DEBUG http.wire: << "HTTP/1.1 303 See Other[\r][\n]"
[main]
2012-10-29 18:54:29,355 DEBUG http.wire: << "Date: Mon, 29 Oct 2012 17:55:57
GMT[\r][\n]" [main]
2012-10-29 18:54:29,355 DEBUG http.wire: << "Server: Apache[\r][\n]" [main]
2012-10-29 18:54:29,355 DEBUG http.wire: << "Expires: Thu, 19 Nov 1981 08:52:00
GMT[\r][\n]" [main]
2012-10-29 18:54:29,356 DEBUG http.wire: << "Cache-Control: no-store, no-cache,
must-revalidate, post-check=0, pre-check=0[\r][\n]" [main]
2012-10-29 18:54:29,356 DEBUG http.wire: << "Pragma: no-cache[\r][\n]" [main]
2012-10-29 18:54:29,356 DEBUG http.wire: << "Set-Cookie: bb_lastactivity=0;
expires=Tue, 29-Oct-2013 17:55:57 GMT; path=/[\r][\n]" [main]
2012-10-29 18:54:29,356 DEBUG http.wire: << "Location:
http://handheld.vn/content/4052-????nh-gi??-m??y-t??nh-b???ng-Kindle-Fire-HD-7-inch[\r][\n]"
[main]
2012-10-29 18:54:29,357 DEBUG http.wire: << "Content-Length: 0[\r][\n]" [main]
2012-10-29 18:54:29,357 DEBUG http.wire: << "Connection: close[\r][\n]" [main]
2012-10-29 18:54:29,357 DEBUG http.wire: << "Content-Type: text/html[\r][\n]"
[main]
2012-10-29 18:54:29,357 DEBUG http.wire: << "[\r][\n]" [main]
2012-10-29 18:54:29,357 DEBUG conn.DefaultClientConnection: Receiving response:
HTTP/1.1 303 See Other [main]
2012-10-29 18:54:29,357 DEBUG http.headers: << HTTP/1.1 303 See Other [main]
2012-10-29 18:54:29,358 DEBUG http.headers: << Date: Mon, 29 Oct 2012 17:55:57
GMT [main]
2012-10-29 18:54:29,358 DEBUG http.headers: << Server: Apache [main]
2012-10-29 18:54:29,358 DEBUG http.headers: << Expires: Thu, 19 Nov 1981
08:52:00 GMT [main]
2012-10-29 18:54:29,358 DEBUG http.headers: << Cache-Control: no-store,
no-cache, must-revalidate, post-check=0, pre-check=0 [main]
2012-10-29 18:54:29,358 DEBUG http.headers: << Pragma: no-cache [main]
2012-10-29 18:54:29,358 DEBUG http.headers: << Set-Cookie: bb_lastactivity=0;
expires=Tue, 29-Oct-2013 17:55:57 GMT; path=/ [main]
2012-10-29 18:54:29,358 DEBUG http.headers: << Location:
http://handheld.vn/content/4052-Äánh-giá-máy-tÃnh-bảng-Kindle-Fire-HD-7-inch
[main]
2012-10-29 18:54:29,358 DEBUG http.headers: << Content-Length: 0 [main]
2012-10-29 18:54:29,358 DEBUG http.headers: << Connection: close [main]
2012-10-29 18:54:29,359 DEBUG http.headers: << Content-Type: text/html [main]
Unfortunately, I can't get the resolved URL with the following code:
Header locationHeader = response.getFirstHeader("location");
which will return
http://handheld.vn/content/4052-Äánh-giá-máy-tÃnh-bảng-Kindle-Fire-HD-7-inch
The header value has already been decoded using the wrong character encoding,
so I cannot get the correct redirect URL from it.
I understand that this is not RFC-compliant behavior, but the above URL and
redirect work fine in all browsers.
Is it possible to access the raw header (byte array) so that I can choose the
encoding myself? This would help a lot. Alternatively, a parameter to
optionally specify the encoding when fetching a header value would work.
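To illustrate what choosing the encoding would look like, here is a minimal
sketch using only the JDK. It assumes the header value (for example the string
returned by locationHeader.getValue()) was produced by mapping each raw byte to
one character, ISO-8859-1 style; if the bytes were replaced during decoding,
this cannot recover them. The class name LocationRedecoder, its method names,
and the shortened sample URL are hypothetical and not part of HttpClient.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;

class LocationRedecoder {

    // Assumption: each char of headerValue corresponds to exactly one raw byte
    // (ISO-8859-1 style decoding), so encoding back to ISO-8859-1 recovers the
    // original bytes, which can then be decoded as UTF-8.
    static String redecodeAsUtf8(String headerValue) {
        byte[] raw = headerValue.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Percent-encodes non-ASCII characters (as UTF-8) so that the URL can be
    // used in a follow-up request.
    static String toAsciiUrl(String url) {
        return URI.create(url).toASCIIString();
    }

    public static void main(String[] args) {
        // Shortened sample of the redirect target, written with Unicode escapes
        // so the source file does not depend on the compiler's encoding.
        String intended = "http://handheld.vn/content/4052-\u0110\u00e1nh-gi\u00e1";

        // Simulate the mis-decoding: UTF-8 bytes read as ISO-8859-1 characters.
        String mangled = new String(
                intended.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        String repaired = redecodeAsUtf8(mangled);
        System.out.println(repaired.equals(intended));
        System.out.println(toAsciiUrl(repaired));
    }
}
```

This only works around the problem on the caller's side; access to the raw
header bytes would still be the cleaner solution.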