Ben Fortuna commented on COCOON-2352:

Hmm, do you have a link to the source? I checked BRANCH_2_1_X and it still 
has the old code. I also noticed the error is reported on line 42, but the 
test I submitted has only 33 lines. 

Note that it is important for the test to pass both halves of the surrogate 
pair to the same encoder, in order, which is why I wrote the sequence like this:

char[] expectedValue = encoder.encode((char) 127808);
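// surrogate 1/2: the high surrogate alone should produce no output
assertTrue(encoder.encode('\uD83C').length == 0);
// surrogate 2/2: the low surrogate completes the pair and yields the expected value
assertTrue(Arrays.equals(expectedValue, encoder.encode('\uDF40')));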
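
For reference, here is a minimal sketch of the behaviour that sequence relies on. SketchEncoder is a hypothetical stand-in, not Cocoon's XMLEncoder: it assumes an encode(char) method that returns a char[] and, for simplicity, writes every character as a hexadecimal numeric character reference.

```java
/** Hypothetical stand-in for illustration only; not the real XMLEncoder. */
class SketchEncoder {
    // Buffered high surrogate, or 0 when no pair is in progress.
    private char pendingHigh = 0;

    /**
     * Returns a numeric character reference for c, or an empty array
     * while waiting for the second half of a surrogate pair.
     */
    char[] encode(char c) {
        if (Character.isHighSurrogate(c)) {
            // First half of a pair: remember it and emit nothing yet.
            pendingHigh = c;
            return new char[0];
        }
        if (Character.isLowSurrogate(c) && pendingHigh != 0) {
            // Second half: combine both halves into one code point.
            int codePoint = Character.toCodePoint(pendingHigh, c);
            pendingHigh = 0;
            return reference(codePoint);
        }
        // Ordinary character (or an unpaired low surrogate): encode as-is.
        pendingHigh = 0;
        return reference(c);
    }

    // Assumption: output is a hexadecimal reference such as &#x1F340;
    private static char[] reference(int codePoint) {
        String hex = Integer.toHexString(codePoint).toUpperCase();
        return ("&#x" + hex + ";").toCharArray();
    }
}
```

With this sketch, encode('\uD83C') returns an empty array and the following encode('\uDF40') returns &#x1F340;. Because the state lives in the encoder instance, the two halves have to go through the same instance one after the other. Note also that in Java a plain (char) 127808 cast keeps only the low 16 bits and yields '\uF340', whereas Character.toChars(127808) returns the pair '\uD83C', '\uDF40'.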

> XMLEncoder doesn't support Unicode surrogate pairs
> --------------------------------------------------
>                 Key: COCOON-2352
>                 URL: https://issues.apache.org/jira/browse/COCOON-2352
>             Project: Cocoon
>          Issue Type: Bug
>          Components: * Cocoon Core, Blocks: Serializers
>    Affects Versions: 2.1.12
>            Reporter: Ben Fortuna
>            Assignee: Francesco Chicchiriccò
>             Fix For: 2.1.13
> Whilst investigating an issue with the Sling project and support for emoji 
> characters, I noticed that the XMLEncoder used by HTMLSerializer doesn't 
> support the Unicode surrogate pairs used to represent higher-order Unicode 
> characters.
> A simple unit test that demonstrates this issue is here:
> https://github.com/micronode/whistlepost/blob/master/whistlepost-rewrite-lib/src/test/groovy/org/apache/cocoon/components/serializers/encoding/XMLEncoderTest.groovy
> More background info here also: SLING-5973
> This seems to have been identified/addressed in other Apache projects also:
> https://issues.apache.org/jira/browse/THRIFT-3403?jql=text%20~%20%22surrogate%20pairs%22

This message was sent by Atlassian JIRA