[jira] [Commented] (COCOON-2352) XMLEncoder doesn't support Unicode surrogate pairs
[ https://issues.apache.org/jira/browse/COCOON-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15495081#comment-15495081 ]

Ben Fortuna commented on COCOON-2352:
-------------------------------------

Hi Francesco,

The JAR I am using is org.apache.cocoon:cocoon-serializers-charsets:1.0.2, which appears to have been built in 2012. It looks like it came from the BRANCH_2_1_X branch, but I can't be certain.

I will try to make a patch. The easiest for me would be a pull request on GitHub, but if you prefer a patch file I can do that too.

I am looking at the unit tests in the project and they are a little difficult to get my head around. Would you prefer that I write the unit test using HtmlUnit or JUnit, or do you have no preference? It appears the tests haven't been updated for a number of years.

Many thanks.

> XMLEncoder doesn't support Unicode surrogate pairs
> --------------------------------------------------
>
>          Key: COCOON-2352
>          URL: https://issues.apache.org/jira/browse/COCOON-2352
>      Project: Cocoon
>   Issue Type: Bug
>   Components: * Cocoon Core
>     Reporter: Ben Fortuna
>
> Whilst investigating an issue with the Sling project and support for emoji
> characters, I've come to notice that the XMLEncoder used by HTMLSerializer
> doesn't support Unicode surrogate pairs to represent higher order unicode
> characters.
> A simple unit test that demonstrates this issue is here:
> https://github.com/micronode/whistlepost/blob/master/whistlepost-rewrite-lib/src/test/groovy/org/apache/cocoon/components/serializers/encoding/XMLEncoderTest.groovy
> More background info here also: SLING-5973
> This seems to have been identified/addressed in other Apache projects also:
> https://issues.apache.org/jira/browse/THRIFT-3403?jql=text%20~%20%22surrogate%20pairs%22

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (COCOON-2352) XMLEncoder doesn't support Unicode surrogate pairs
[ https://issues.apache.org/jira/browse/COCOON-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492633#comment-15492633 ]

Francesco Chicchiriccò commented on COCOON-2352:
------------------------------------------------

Hi Ben,

thanks for reporting.

Just to confirm: was this bug identified against Cocoon 2.1? Does it also occur with the latest development version, available at [1] (svn checkout from [2])?

Are you willing to provide a patch (possibly including a unit test)?

[1] http://svn.apache.org/repos/asf/cocoon/branches/BRANCH_2_1_X/src/blocks/serializers/java/org/apache/cocoon/components/serializers/encoding/XMLEncoder.java
[2] http://svn.apache.org/repos/asf/cocoon/branches/BRANCH_2_1_X/
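
For readers unfamiliar with the problem reported in COCOON-2352: in Java, a character above U+FFFF (most emoji, for example) is stored as two `char` values, a high and a low surrogate. An encoder that escapes text one `char` at a time therefore sees two halves rather than one character. The following is a minimal sketch of surrogate-aware escaping to numeric character references. It is not Cocoon's `XMLEncoder` code and not the fix proposed in this thread; the class name `SurrogateEscaper`, the method `escape`, and the choice to replace an unpaired surrogate with U+FFFD are all assumptions made for illustration.

```java
/**
 * Hypothetical helper, not part of Cocoon: writes every non-ASCII character
 * as an XML numeric character reference, one reference per Unicode code point.
 * Markup characters such as '<' and '&' are deliberately left untouched here.
 */
final class SurrogateEscaper {

    private SurrogateEscaper() {
    }

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            // codePointAt joins a valid high/low surrogate pair into a single
            // code point; an unpaired surrogate is returned unchanged.
            int cp = text.codePointAt(i);
            // Advance by 2 chars for a supplementary code point, else by 1.
            i += Character.charCount(cp);

            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                // Unpaired surrogate: not a legal XML character.
                // Substituting U+FFFD is an assumption of this sketch.
                out.append("&#xFFFD;");
            } else if (cp < 0x80) {
                // Plain ASCII passes through as-is.
                out.append((char) cp);
            } else {
                // Everything else becomes a hexadecimal character reference.
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase(java.util.Locale.ROOT))
                   .append(';');
            }
        }
        return out.toString();
    }
}
```

With this sketch, `SurrogateEscaper.escape("a\uD83D\uDE00b")` returns `a&#x1F600;b`: the surrogate pair D83D/DE00 is emitted as the single reference for U+1F600 instead of two separate references to the surrogate halves.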