[
https://issues.apache.org/jira/browse/XERCESC-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052988#comment-13052988
]
Michael Glavassevich commented on XERCESC-1967:
-----------------------------------------------
I'm not sure what was done in Xerces-C, but in my opinion reading HTTP headers
to obtain "external encoding information" is the responsibility of the
application, not the XML parser. Users of Java implementations have always had
to fetch that information themselves and provide the encoding by setting it on
the InputSource (i.e. InputSource.setEncoding()). This is by design. It's not a
bug.
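
To illustrate the division of labour described above, here is a minimal sketch
of how a Java application might pass the HTTP charset on to the parser. The
class HeaderEncoding and its methods charsetOf() and toInputSource() are
hypothetical names made up for this example; only org.xml.sax.InputSource and
its setEncoding() method come from the SAX API. The header parsing is
deliberately simplified and assumes the charset value contains no semicolons.

```java
import java.io.InputStream;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Xerces or SAX.
final class HeaderEncoding {

    // Returns the charset parameter of a Content-Type header value,
    // or null if the header is missing or carries no charset.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Parameter names are case-insensitive, so compare ignoring case.
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = p.substring(8).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2
                        && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // Wraps the response body in an InputSource and, when the header names
    // a charset, hands it to the parser through setEncoding().
    static InputSource toInputSource(InputStream body, String contentType) {
        InputSource source = new InputSource(body);
        String charset = charsetOf(contentType);
        if (charset != null) {
            source.setEncoding(charset);
        }
        return source;
    }
}
```

An application would call toInputSource() with the response stream and the
Content-Type value it read from its HTTP client, then pass the result to the
parser. When no charset is present, the encoding is left unset and the parser
falls back to its own detection.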
> Xerces ignores (deletes, swallow, ignores) the UTF-8 BOM and also ignores the
> charset parameter of the HTTP content-type: header
> --------------------------------------------------------------------------------------------------------------------------------
>
> Key: XERCESC-1967
> URL: https://issues.apache.org/jira/browse/XERCESC-1967
> Project: Xerces-C++
> Issue Type: Bug
> Components: Non-Validating Parser
> Affects Versions: 3.1.1
> Environment: Mac OS X Snow Leopard (Intel).
> (http://mirrorservice.nomedia.no/apache.org//xerces/c/3/binaries/xerces-c-3.1.1-x86-macosx-gcc-4.0.tar.gz)
> And also tested the XMLmind XML editor on the same platform.
> Reporter: Leif Halvard Silli
> Assignee: Alberto Massari
> Fix For: 3.2.0
>
> Original Estimate: 4h
> Remaining Estimate: 4h
>
> [1] http://www.w3.org/mid/[email protected]
> [2] http://www.w3.org/mid/[email protected]
> It is an XML 1.0 spec violation: a well-formedness violation.
> Test cases without XML declaration: http://malform.no/testing/html5/bom/
> Test cases *with* XML declaration to be added later.