[ https://issues.apache.org/jira/browse/COCOON-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15560981#comment-15560981 ]
ASF GitHub Bot commented on COCOON-2352:
----------------------------------------

GitHub user benfortuna opened a pull request:

    https://github.com/apache/cocoon/pull/1

    Support for Unicode surrogate pairs

    This PR adds support for encoding surrogate pairs as a single character in the XMLEncoder implementation.

    See [COCOON-2352](https://issues.apache.org/jira/browse/COCOON-2352) for further details.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/benfortuna/cocoon BRANCH_2_1_X

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cocoon/pull/1.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1

----
commit 4975a555b8330446089c81e17e8bfaaaee669600
Author: Ben Fortuna <benfort...@gmail.com>
Date:   2016-10-10T00:11:32Z

    Added required folder for build

commit cf2d9b65eb55b9d19a0b0c179e90fe7c7b70b6e6
Author: Ben Fortuna <benfort...@gmail.com>
Date:   2016-10-10T00:11:58Z

    Added support for decoding surrogate pairs

commit cc68b0040c5afc6286dc767810ea2ec7abd58340
Author: Ben Fortuna <benfort...@gmail.com>
Date:   2016-10-10T01:26:20Z

    Added unit test for encoding unicode surrogate pairs

----

> XMLEncoder doesn't support Unicode surrogate pairs
> --------------------------------------------------
>
>                 Key: COCOON-2352
>                 URL: https://issues.apache.org/jira/browse/COCOON-2352
>             Project: Cocoon
>          Issue Type: Bug
>          Components: * Cocoon Core, Blocks: Serializers
>            Reporter: Ben Fortuna
>
> Whilst investigating an issue with the Sling project and support for emoji
> characters, I've come to notice that the XMLEncoder used by HTMLSerializer
> doesn't support Unicode surrogate pairs to represent higher order unicode
> characters.
> A simple unit test that demonstrates this issue is here:
> https://github.com/micronode/whistlepost/blob/master/whistlepost-rewrite-lib/src/test/groovy/org/apache/cocoon/components/serializers/encoding/XMLEncoderTest.groovy
> More background info here also: SLING-5973
> This seems to have been identified/addressed in other Apache projects also:
> https://issues.apache.org/jira/browse/THRIFT-3403?jql=text%20~%20%22surrogate%20pairs%22



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
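
For readers unfamiliar with the problem reported in this issue, here is a minimal Java sketch of the general technique. It is not the Cocoon XMLEncoder code and not the patch from the pull request. The class and method names (`SurrogatePairSketch`, `encodeNonAscii`, `encodePerChar`) are hypothetical, and the choice to escape every non-ASCII character as a hexadecimal character reference is an assumption made only for illustration. It contrasts a per-`char` loop, which treats the two halves of a surrogate pair separately, with a per-code-point loop, which combines them into one character.

```java
public final class SurrogatePairSketch {

    private SurrogatePairSketch() {
    }

    /**
     * Hypothetical helper: writes each non-ASCII code point as one
     * hexadecimal character reference, combining surrogate pairs.
     */
    public static String encodeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            // codePointAt merges a valid high/low surrogate pair into a
            // single supplementary code point (U+10000 and above).
            // An unpaired surrogate is returned unchanged; this sketch
            // does not reject it.
            int codePoint = text.codePointAt(i);
            appendEscaped(out, codePoint);
            // Advance by 2 chars for a supplementary code point, else by 1.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    /**
     * Naive variant for comparison: handles one UTF-16 char at a time,
     * so each half of a surrogate pair gets its own reference.
     */
    public static String encodePerChar(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            appendEscaped(out, text.charAt(i));
        }
        return out.toString();
    }

    private static void appendEscaped(StringBuilder out, int codePoint) {
        if (codePoint < 0x80) {
            out.append((char) codePoint);
        } else {
            out.append("&#x")
               .append(Integer.toHexString(codePoint).toUpperCase())
               .append(';');
        }
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00"; // U+1F600 as a UTF-16 surrogate pair
        System.out.println(encodeNonAscii(emoji)); // &#x1F600;
        System.out.println(encodePerChar(emoji));  // &#xD83D;&#xDE00;
    }
}
```

The per-`char` output refers to code points in the surrogate range, which XML does not allow as characters, whereas the per-code-point output is a single valid reference to U+1F600.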