[ 
https://issues.apache.org/jira/browse/COCOON-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495081#comment-15495081
 ] 

Ben Fortuna commented on COCOON-2352:
-------------------------------------

Hi Francesco,

The JAR I am using is org.apache.cocoon:cocoon-serializers-charsets:1.0.2, 
which appears to have been built in 2012. It looks like it came from the 
BRANCH_2_1.X branch, but I can't be certain.

I will try to make a patch. The easiest for me would be a pull request on 
GitHub, but if you prefer a patch file I can do that as well. 

I am looking at the unit tests in the project and finding them a little 
difficult to get my head around. Would you prefer that I write a unit test 
using htmlunit or junit, or do you have no preference? It appears the tests 
haven't been updated for a number of years. Many thanks.


> XMLEncoder doesn't support Unicode surrogate pairs
> --------------------------------------------------
>
>                 Key: COCOON-2352
>                 URL: https://issues.apache.org/jira/browse/COCOON-2352
>             Project: Cocoon
>          Issue Type: Bug
>          Components: * Cocoon Core
>            Reporter: Ben Fortuna
>
> Whilst investigating an issue with the Sling project and support for emoji 
> characters, I noticed that the XMLEncoder used by HTMLSerializer doesn't 
> support Unicode surrogate pairs, which are used to represent characters 
> outside the Basic Multilingual Plane (code points above U+FFFF).
> A simple unit test that demonstrates this issue is here:
> https://github.com/micronode/whistlepost/blob/master/whistlepost-rewrite-lib/src/test/groovy/org/apache/cocoon/components/serializers/encoding/XMLEncoderTest.groovy
> More background info here also: SLING-5973
> This seems to have been identified/addressed in other Apache projects also:
> https://issues.apache.org/jira/browse/THRIFT-3403?jql=text%20~%20%22surrogate%20pairs%22
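
For illustration only, here is a minimal Java sketch of what surrogate-pair-aware escaping could look like. The class and method names (SurrogateAwareEncoder, encode) are hypothetical and are not Cocoon's XMLEncoder API. The sketch assumes that every non-ASCII code point should be written as one numeric character reference, and it does not escape markup characters or validate lone surrogates.

```java
public class SurrogateAwareEncoder {

    // Hypothetical helper, not part of Cocoon. Walks the string by code point
    // rather than by char, so a surrogate pair yields a single reference.
    // Assumption: markup characters (<, >, &) are handled elsewhere, and
    // unpaired surrogates are not detected here.
    public static String encode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            // codePointAt combines a high/low surrogate pair into one value.
            int codePoint = input.codePointAt(i);
            if (codePoint < 0x80) {
                out.append((char) codePoint);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(codePoint).toUpperCase())
                   .append(';');
            }
            // Advance by 2 chars for a supplementary code point, otherwise 1.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }
}
```

With this approach, the emoji U+1F600 (stored in a Java String as the pair \uD83D\uDE00) would be emitted as one reference, &#x1F600;, rather than as two separate references to the individual surrogates.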



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
