[
https://issues.apache.org/jira/browse/XALANJ-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310888#comment-16310888
]
Thomas Scheffler commented on XALANJ-2560:
------------------------------------------
Because Xalan produces invalid XML, this is a real show stopper. It is sad to
see that it is still unresolved.
> ToXMLStream does not support unicode supplementary characters
> -------------------------------------------------------------
>
> Key: XALANJ-2560
> URL: https://issues.apache.org/jira/browse/XALANJ-2560
> Project: XalanJ2
> Issue Type: Bug
> Security Level: No security risk; visible to anyone (Ordinary problems in
> Xalan projects. Anybody can view the issue.)
> Components: Serialization
> Affects Versions: 2.7.1
> Environment: Xalan 2.7.1 serializer.
> Tested on Ubuntu 12.04 with Oracle JDK 1.7.0_05.
> Reporter: Damien Guillaume
> Labels: serialization, unicode
>
> org.apache.xml.serializer.ToXMLStream (which extends ToStream) does not
> support serialization of Unicode supplementary characters such as U+1D49C. It
> creates invalid character entities like "&#55349;&#56476;" instead of
> "&#119964;" (or F0 9D 92 9C with UTF-8). ToXMLStream is used by LSSerializer
> when Xalan's serializer is on the classpath.
> org.apache.xml.serialize.DOMSerializerImpl (included in Xerces) does not have
> this problem, but it has been deprecated since Xerces 2.9.0, so this is a
> regression.
> See
> http://stackoverflow.com/questions/11952289/serializing-supplementary-unicode-characters-into-xml-documents-with-java
> for more details.
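
To illustrate the failure mode described in the report, here is a minimal
sketch in plain Java. It does not use or reproduce Xalan's code; the class
SupplementaryEscapeSketch and both helper methods are hypothetical names made
up for this example. It contrasts escaping each UTF-16 code unit separately
(which yields two surrogate references, not legal in XML) with escaping each
Unicode code point (which yields a single valid reference for U+1D49C).

```java
class SupplementaryEscapeSketch {

    // Hypothetical helper showing the broken pattern: every UTF-16 code unit
    // above ASCII is escaped on its own, so a surrogate pair becomes two
    // numeric character references.
    static String escapePerChar(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0x7F) {
                sb.append("&#").append((int) c).append(';');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper showing the expected pattern: iterate by code point
    // so that a surrogate pair is combined before it is escaped.
    static String escapePerCodePoint(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp > 0x7F) {
                sb.append("&#").append(cp).append(';');
            } else {
                sb.append((char) cp);
            }
            i += Character.charCount(cp); // advance 2 units for supplementary characters
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1D49C is stored as the surrogate pair D835 DC9C in a Java String.
        String s = new String(Character.toChars(0x1D49C));
        System.out.println(escapePerChar(s));      // &#55349;&#56476;
        System.out.println(escapePerCodePoint(s)); // &#119964;
    }
}
```

The first output matches the invalid references quoted in the report; the
second is the single reference a conforming serializer would be expected to
write when the output encoding cannot represent the character directly.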
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)