[
https://issues.apache.org/jira/browse/XERCESC-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046454#comment-13046454
]
Alberto Massari commented on XERCESC-1967:
------------------------------------------
In my opinion, the parser should obey only the encoding seen in the XML
declaration, as it has no control over whatever envelope was used to transmit
the XML. It is up to the code that handles the transmission of the XML to
decide how to encode/decode the data for its communication channel; so,
speaking of Xerces, the entity resolver that reads from HTTP should read the
Content-Type header and force that encoding in the parser. I'll check whether
this is working as expected.
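
As a rough illustration of the division of labour described above, here is a
minimal sketch of the transport-side step: pulling the charset parameter out of
a Content-Type header value. The function name charsetFromContentType is
hypothetical and not part of Xerces-C++, and the parsing is simplified (it does
not handle a ';' inside a quoted value of some other parameter). The resulting
name would then be forced on the input source the resolver returns, presumably
through InputSource::setEncoding after transcoding to XMLCh; that wiring is an
assumption and is not shown here.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not a Xerces-C++ API: returns the value of the
// "charset" parameter of an HTTP Content-Type header value, or an empty
// string when the header carries no such parameter.
std::string charsetFromContentType(const std::string& headerValue)
{
    // Work on a lower-cased copy so the parameter name matches
    // case-insensitively; the value itself is taken from the original.
    std::string lower(headerValue);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::string key = "charset=";
    std::size_t pos = 0;

    // Parameters follow the media type and are separated by ';'.
    while ((pos = lower.find(';', pos)) != std::string::npos) {
        ++pos;
        // Skip optional whitespace after the separator.
        while (pos < lower.size() &&
               std::isspace(static_cast<unsigned char>(lower[pos]))) {
            ++pos;
        }
        if (lower.compare(pos, key.size(), key) != 0) {
            continue; // some other parameter, keep scanning
        }

        const std::size_t start = pos + key.size();
        if (start < headerValue.size() && headerValue[start] == '"') {
            // Quoted value: take everything up to the closing quote.
            const std::size_t end = headerValue.find('"', start + 1);
            if (end == std::string::npos) {
                return headerValue.substr(start + 1);
            }
            return headerValue.substr(start + 1, end - start - 1);
        }

        // Unquoted value: ends at the next separator or whitespace.
        const std::size_t end = headerValue.find_first_of("; \t", start);
        if (end == std::string::npos) {
            return headerValue.substr(start);
        }
        return headerValue.substr(start, end - start);
    }
    return std::string();
}
```

With this split, the parser itself still only looks at the XML declaration; an
empty result means the transport layer has nothing to override.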
> Xerces ignores (deletes, swallow, ignores) the UTF-8 BOM and also ignores the
> charset parameter of the HTTP content-type: header
> --------------------------------------------------------------------------------------------------------------------------------
>
> Key: XERCESC-1967
> URL: https://issues.apache.org/jira/browse/XERCESC-1967
> Project: Xerces-C++
> Issue Type: Bug
> Components: Non-Validating Parser
> Affects Versions: 3.1.1
> Environment: Mac OS X Snow Leopard (Intel).
> (http://mirrorservice.nomedia.no/apache.org//xerces/c/3/binaries/xerces-c-3.1.1-x86-macosx-gcc-4.0.tar.gz)
> And also tested the XMLmind XML editor on the same platform.
> Reporter: Leif Halvard Silli
> Original Estimate: 4h
> Remaining Estimate: 4h
>
> [1] http://www.w3.org/mid/[email protected]
> [2] http://www.w3.org/mid/[email protected]
> It is an XML 1.0 spec violation: a well-formedness violation.
> Test cases without XML declaration: http://malform.no/testing/html5/bom/
> Test cases *with* XML declaration to be added later.