Johan Compagner created HTTPCLIENT-2244:
-------------------------------------------

             Summary: Default response encoding is US-ASCII but it should be ISO-8859-1
                 Key: HTTPCLIENT-2244
                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2244
             Project: HttpComponents HttpClient
          Issue Type: Bug
          Components: HttpClient (async)
    Affects Versions: 5.1.3
            Reporter: Johan Compagner


Here (and in the getBodyBytes() method above it):

[httpcomponents-client/SimpleBody.java at master · apache/httpcomponents-client (github.com)|https://github.com/apache/httpcomponents-client/blob/master/httpclient5/src/main/java/org/apache/hc/client5/http/async/methods/SimpleBody.java#L86]

 

you can see that the default charset used to read a body is 
StandardCharsets.US_ASCII. That is not correct; it should be 
StandardCharsets.ISO_8859_1.
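
To illustrate the difference, here is a minimal sketch using only the JDK (no HttpClient classes). The class name CharsetDefaultDemo and the sample bytes are made up for illustration; the bytes are "Grüße" encoded as ISO-8859-1, the kind of body a server might send without declaring a charset.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of HttpClient.
final class CharsetDefaultDemo {

    // "Grüße" encoded as ISO-8859-1: ü = 0xFC, ß = 0xDF.
    static final byte[] SAMPLE = {0x47, 0x72, (byte) 0xFC, (byte) 0xDF, 0x65};

    // Decoding with US-ASCII: bytes above 0x7F cannot be mapped and are
    // replaced with U+FFFD (the replacement character).
    static String decodeAscii(byte[] body) {
        return new String(body, StandardCharsets.US_ASCII);
    }

    // Decoding with ISO-8859-1: every byte maps to exactly one character.
    static String decodeLatin1(byte[] body) {
        return new String(body, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Compare against escaped literals so the result does not depend
        // on the console or source file encoding.
        System.out.println(decodeLatin1(SAMPLE).equals("Gr\u00FC\u00DFe")); // true
        System.out.println(decodeAscii(SAMPLE).equals("Gr\uFFFD\uFFFDe"));  // true
    }
}
```

With US_ASCII as the default the umlauts are lost irrecoverably once the bytes have been turned into a String, which matches the symptom described in this report.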

 

That was the default in HC 4.x, and it is also described in the docs: 
[https://hc.apache.org/httpclient-legacy/charencodings.html]

 

I know the server should specify the encoding because of the differing 
interpretations, but we do not always control the server or what it sends. 
According to the spec: 

[https://www.w3.org/International/articles/http-charset/index]

the default should be ISO_8859_1.

Clients of ours that use this now suddenly see that German umlauts are not 
transferred correctly, whereas with HC4 they worked fine.
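
As a possible client-side workaround until this is changed, the body bytes can be decoded by hand with an ISO-8859-1 fallback. The sketch below is JDK-only; the class BodyTextFallback and its decode method are hypothetical names, not HttpClient API. The assumption is that the caller passes in the raw bytes (for example from getBodyBytes()) and the charset declared by the server, or null when none was declared.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpClient.
final class BodyTextFallback {

    // Decode with the declared charset if there is one,
    // otherwise fall back to ISO-8859-1 instead of US-ASCII.
    static String decode(byte[] bodyBytes, Charset declared) {
        if (bodyBytes == null) {
            return null;
        }
        Charset charset = declared != null ? declared : StandardCharsets.ISO_8859_1;
        return new String(bodyBytes, charset);
    }

    public static void main(String[] args) {
        // "für" encoded as ISO-8859-1, with no charset declared by the server.
        byte[] latin1Body = {0x66, (byte) 0xFC, 0x72};
        System.out.println(decode(latin1Body, null).equals("f\u00FCr")); // true

        // A declared charset still takes precedence over the fallback.
        byte[] utf8Body = "f\u00FCr".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(utf8Body, StandardCharsets.UTF_8).equals("f\u00FCr")); // true
    }
}
```

This only papers over the problem in application code; the report still asks for the library default itself to be ISO_8859_1.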

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
