[
https://issues.apache.org/jira/browse/LANG-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502541#comment-13502541
]
Gary Gregory commented on LANG-858:
-----------------------------------
I've added some more tests and @Ignore'd the failing ones.
escapeJava() should behave correctly; we need to work out how to make that
happen under the hood without losing our current flexibility or making the
whole escaping process Java-specific.
> StringEscapeUtils.escapeJava() does not output the escaped surrogate pairs
> that is Java parsable
> ------------------------------------------------------------------------------------------------
>
> Key: LANG-858
> URL: https://issues.apache.org/jira/browse/LANG-858
> Project: Commons Lang
> Issue Type: Bug
> Components: lang.*, lang.text.translate.*
> Affects Versions: 3.x
> Reporter: Kazuki Hamasaki
> Priority: Minor
> Labels: escaping
> Attachments: JavaUnicodeEscape.patch
>
>
> In Java and ECMAScript, a Unicode escape of the form {{'\uxxxxxx'}} (more
> than four hex digits) is not accepted. A supplementary character has to be
> written as two escapes, one for the high surrogate and one for the
> low surrogate.
> For example, if you pass in the surrogate pair
> {code:java}
> '\uDBFF\uDFFD'
> {code}
> the output must be
> {code:java}
> "\\uDBFF\\uDFFD"
> {code}
> However, you get
> {code:java}
> "\\u10FFFD"
> {code}
> Test case here:
> {code:java}
> @Test
> public void testEscapeSurrogatePairs() throws Exception {
>     assertEquals("\\uDBFF\\uDFFD",
>             StringEscapeUtils.escapeJava("\uDBFF\uDFFD"));
>     assertEquals("\\uDBFF\\uDFFD",
>             StringEscapeUtils.escapeEcmaScript("\uDBFF\uDFFD"));
> }
> {code}
> I have attached a patch that implements a simple solution.
> However, I don't think UnicodeEscaper.java should be made specific to Java,
> so we need to discuss this.
> The issue does not occur when unescaping.
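
To make the expected behaviour in the report concrete, here is a minimal sketch of escaping per UTF-16 code unit, so that a supplementary code point comes out as two four-digit escapes. The class and method names (SurrogateEscapeSketch, escapeUnit, escapeNonAscii) are hypothetical; this is neither the attached JavaUnicodeEscape.patch nor the UnicodeEscaper implementation, and it deliberately ignores the other things escapeJava() handles (quotes, backslashes, control characters).

```java
public class SurrogateEscapeSketch {

    /** Formats one UTF-16 code unit as a backslash, the letter u, and four hex digits. */
    public static String escapeUnit(char unit) {
        return String.format("\\u%04X", (int) unit);
    }

    /**
     * Hypothetical helper: leaves ASCII untouched and escapes everything else
     * one UTF-16 code unit at a time, so a supplementary code point becomes
     * a high-surrogate escape followed by a low-surrogate escape.
     */
    public static String escapeNonAscii(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int codePoint = input.codePointAt(i);
            if (codePoint < 0x80) {
                out.append((char) codePoint);
            } else {
                // Character.toChars returns one char for BMP code points
                // and a surrogate pair for supplementary ones.
                for (char unit : Character.toChars(codePoint)) {
                    out.append(escapeUnit(unit));
                }
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Build U+10FFFD at runtime instead of typing it as a source-level escape.
        String supplementary = new String(Character.toChars(0x10FFFD));
        System.out.println(escapeNonAscii(supplementary));
    }
}
```

For U+10FFFD this prints \uDBFF\uDFFD, the form the test case in the report expects. Whether the split belongs in the shared escaper or in a Java/ECMAScript-specific subclass is the open design question here, and the sketch does not take a side on it.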