[
https://issues.apache.org/jira/browse/HTTPCLIENT-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160971#comment-13160971
]
Ian Beaumont commented on HTTPCLIENT-1149:
------------------------------------------
Looking at RFC 3629, the paragraph immediately after the one you quote is:
o A protocol SHOULD NOT forbid use of U+FEFF as a signature for
those textual protocol elements for which the protocol does not
provide character encoding identification mechanisms, when a ban
would be unenforceable, or when it is expected that
implementations of the protocol will not be in a position to
always use the mechanisms properly. The latter two cases are
likely to occur with larger protocol elements such as MIME
entities, especially when implementations of the protocol will
obtain such entities from file systems, from protocols that do not
have encoding identification mechanisms for payloads (such as FTP)
or from other protocols that do not guarantee proper
identification of character encoding (such as HTTP).
Isn't that more relevant?
Would it be an issue to wrap the input stream in a BOMInputStream?
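To make the wrapping idea concrete, here is a minimal sketch using only JDK classes. It is not how EntityUtils or BOMInputStream is implemented; the class name BomSketch and the method toStringSkippingBom are hypothetical, and it assumes Java 7+ for StandardCharsets. It peeks at the first three bytes, picks a charset from a UTF-8 or UTF-16 BOM if one is present, pushes back any bytes that were not part of a BOM, and decodes the rest. UTF-32 BOMs are deliberately ignored to keep the sketch short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class BomSketch {

    /**
     * Reads the whole stream into a String. A leading UTF-8 or UTF-16 BOM
     * selects the charset and is dropped; otherwise the fallback is used.
     */
    static String toStringSkippingBom(InputStream in, Charset fallback) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = readUpTo(pin, head);

        Charset charset = fallback;
        int bomLength = 0;
        if (n >= 3 && (head[0] & 0xFF) == 0xEF && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            charset = StandardCharsets.UTF_8;
            bomLength = 3;
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            charset = StandardCharsets.UTF_16BE;
            bomLength = 2;
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            // Assumption: treated as UTF-16LE; a UTF-32LE BOM starts the same way.
            charset = StandardCharsets.UTF_16LE;
            bomLength = 2;
        }

        // Return any bytes that were read but are not part of the BOM.
        if (n > bomLength) {
            pin.unread(head, bomLength, n - bomLength);
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int r;
        while ((r = pin.read(buf)) != -1) {
            out.write(buf, 0, r);
        }
        return new String(out.toByteArray(), charset);
    }

    // Fills buf as far as the stream allows; returns the number of bytes read.
    private static int readUpTo(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int r = in.read(buf, total, buf.length - total);
            if (r == -1) {
                break;
            }
            total += r;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] body = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        System.out.println(
                toStringSkippingBom(new ByteArrayInputStream(body), StandardCharsets.ISO_8859_1));
    }
}
```

With HttpClient, a caller could presumably pass entity.getContent() as the stream and the charset from the Content-Type header as the fallback.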
> EntityUtils.toString should detect Byte order mark (BOM) and remove it if
> present
> ---------------------------------------------------------------------------------
>
> Key: HTTPCLIENT-1149
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1149
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient
> Affects Versions: 4.1.2
> Environment: Windows
> Reporter: Ian Beaumont
> Priority: Minor
> Labels: BOM, EntityUtils
>
> The Byte order mark at the start of the input stream should be detected and
> removed by EntityUtils.toString, otherwise strange unwanted characters are
> left at the start.
> This link lists the possible byte order marks:
> http://en.wikipedia.org/wiki/Byte_order_mark
> I'm not sure whether EntityUtils.toString uses the BOM to try to detect the
> encoding, but if it doesn't, then it should.
> An example URL that causes this issue is the Microsoft Virtual Earth WSDL file:
> HttpClient httpclient = new DefaultHttpClient();
> HttpGet httpget = new HttpGet("http://dev.virtualearth.net/webservices/v1/searchservice/searchservice.svc?wsdl");
> HttpResponse response = httpclient.execute(httpget);
> HttpEntity entity = response.getEntity();
> String textContents = EntityUtils.toString(entity);