On Mon, 2007-06-11 at 17:27 +0800, Feng Jiang wrote:
> Hi all,
>
> I think the implementation of HttpMethodParams#getHttpElementCharset() has a
> problem. By default, httpclient will choose US-ASCII as the charset to
> decode HTTP elements, such as some headers.
>
> But I do meet some servers from which the Location header is in some other
> charset, such as UTF-8, so that httpclient cannot handle the
> redirection correctly (in my application, I handle it by myself). For
> example, one server responds with a header like this:
>
> Location: http://www.abc.com/****(some chinese character)/hello/world
>
> The above URL contains some Chinese characters in some other charset, such
> as GBK. The right way for httpclient should be:
> 1. detect the charset of the URL.
> 2. decode the URL in that correct charset to a java.lang.String.
> 3. construct the correct header instance.
>
> Am I right?
>
Not really. The use of non-ASCII characters in HTTP head elements (such as
headers or a request line) is a violation of the HTTP specification. You can
explicitly override the standard charset with a non-standard one such as
UTF-8 or GBK by setting the 'http.protocol.element-charset' parameter, but I
do not think HttpClient should attempt to 'guess' the charset being used.

For details see:

http://jakarta.apache.org/commons/httpclient/charencodings.html

Oleg

> Feng
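
To illustrate why the element charset matters, here is a minimal sketch using only the JDK, not HttpClient itself. It assumes a hypothetical server that sends the Location header as UTF-8 bytes. The class name ElementCharsetDemo, the decode() helper and the sample Chinese characters are made up for the illustration. Decoding the same bytes as US-ASCII loses the non-ASCII characters, while decoding with the charset the server actually used recovers them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ElementCharsetDemo {

    // Decode raw header bytes using an explicitly named charset.
    // Bytes that are invalid in that charset are replaced with U+FFFD.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Hypothetical header; \u4e2d\u6587 stands in for the Chinese path segment.
        String original = "Location: http://www.abc.com/\u4e2d\u6587/hello/world";

        // Assumption: the server put UTF-8 bytes on the wire.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        String asAscii = decode(wire, "US-ASCII"); // default element charset
        String asUtf8 = decode(wire, "UTF-8");     // explicit override

        System.out.println("US-ASCII round-trips: " + asAscii.equals(original));
        System.out.println("UTF-8 round-trips:    " + asUtf8.equals(original));
    }
}
```

In HttpClient 3.x the equivalent override should be the 'http.protocol.element-charset' parameter mentioned above, which, to my knowledge, can also be set through HttpMethodParams#setHttpElementCharset(String) on the method's params. The override only helps if you already know which charset the server uses.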
