Hello Andrew,

Sorry that my mail yesterday took 9 hours to get to the list.
I hope this one appears in a timely manner :-)


"Andrew A. Sabitov" <[EMAIL PROTECTED]> wrote on 16.06.2005 
03:04:05:

> Server sends Shift_JIS as page charset. 
> 
> it's my code now:
> 
> ............
> result = new HttpResponse ( method.getResponseBodyAsStream (),
>                             method.getResponseCharSet() );
> .........
> 
> //in HttpResponse constructor:
> HttpResponse ( InputStream responseBodyAsStream, String charset )
>         throws IOException {
>         BufferedReader reader = new BufferedReader (
>                 new InputStreamReader ( responseBodyAsStream, charset ) );
>         String line = null;
>         while ( ( line = reader.readLine() ) != null ) {
>             this.add( line );
>             out.write( line );
>             out.write( "\n" );
>         }
> 
> }
> 
> It works. :)
> 
> It's funny, but
> http://jakarta.apache.org/commons/httpclient/3.0/charencodings.html
> says: "If the response is known to be a String, you can use the
> getResponseBodyAsString method which will automatically use the encoding
> specified in the Content-Type header or ISO-8859-1 if no charset is
> specified."
> 
> Content-Type for this page is "text/html; charset=Shift_JIS", I really
> thought that httpclient autoconverts the body... :(
> 

I've checked the code for 3.0. Here are the relevant fragments:

http://svn.apache.org/repos/asf/jakarta/commons/proper/httpclient/trunk/src/java/org/apache/commons/httpclient/HttpMethodBase.java
method getResponseBodyAsString:
        byte[] rawdata...
            ... = getResponseBody()
        ...
            return EncodingUtil.getString(rawdata, getResponseCharSet())
        ...


http://svn.apache.org/repos/asf/jakarta/commons/proper/httpclient/trunk/src/java/org/apache/commons/httpclient/util/EncodingUtil.java
method getString(byte[],int,int,String):

       ...  return new String(data, offset, length, charset)
       ...  LOG.warn("Unsupported encoding: " + charset + ". System encoding used");
            return new String(data, offset, length);

I wonder whether the InputStreamReader recognizes charsets that the String
constructor doesn't? But why should it? And why wouldn't you get the 
warning?
Something is fishy here.
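
To rule that out, a quick check along the following lines might help. It
decodes the same bytes once through the String constructor (as
EncodingUtil.getString does) and once through an InputStreamReader (as
your HttpResponse constructor does), and reports whether the charset name
is accepted in both cases. This is only a sketch: the class name
CharsetCheck, the helper names and the sample bytes are my own
assumptions, not part of HttpClient, and it uses current Java syntax.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.Objects;

// Hypothetical helper class, for illustration only.
class CharsetCheck {

    // Decode with the String constructor, the path EncodingUtil.getString takes.
    // Returns null if the JVM does not recognize the charset name.
    static String viaString(byte[] data, String charset) {
        try {
            return new String(data, 0, data.length, charset);
        } catch (UnsupportedEncodingException e) {
            return null;
        }
    }

    // Decode with an InputStreamReader, the path the HttpResponse constructor takes.
    // Returns null if the JVM does not recognize the charset name.
    static String viaReader(byte[] data, String charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (UnsupportedEncodingException e) {
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String charset = args.length > 0 ? args[0] : "Shift_JIS";
        // Sample input (my assumption): two kanji encoded in Shift_JIS.
        byte[] sample = { (byte) 0x93, (byte) 0xFA, (byte) 0x96, (byte) 0x7B };

        String fromString = viaString(sample, charset);
        String fromReader = viaReader(sample, charset);

        System.out.println("Charset.isSupported:     " + Charset.isSupported(charset));
        System.out.println("String constructor ok:   " + (fromString != null));
        System.out.println("InputStreamReader ok:    " + (fromReader != null));
        System.out.println("Both paths agree:        " + Objects.equals(fromString, fromReader));
    }
}
```

If both paths accept Shift_JIS and agree on your JVM, the difference is
probably not in the decoding itself, and it would be worth checking what
getResponseCharSet() actually returns at the point where
getResponseBodyAsString is called.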

cheers,
  Roland
