[
https://issues.apache.org/jira/browse/LANG-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502298#comment-13502298
]
Gary Gregory commented on LANG-858:
-----------------------------------
Ah, I see, you mean UnicodeEscaper should not be _specific_ or _tied_ to Java
because Java requires a surrogate pair and not just one value.
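To make the "surrogate pair and not just one value" point concrete, here is a minimal sketch of the splitting step. It is not the attached patch and not the Commons Lang implementation; the class name SurrogateEscapeSketch and the helper escapeCodePoint are hypothetical names used only for illustration. It relies only on the standard java.lang.Character methods.

```java
class SurrogateEscapeSketch {

    // Hypothetical helper (assumption, not a Commons Lang API): returns a
    // Java-parsable Unicode escape for a single code point.
    static String escapeCodePoint(int codePoint) {
        if (Character.isSupplementaryCodePoint(codePoint)) {
            // Above U+FFFF: split into a high and a low surrogate and
            // escape each 16-bit unit on its own.
            char[] pair = Character.toChars(codePoint);
            return String.format("\\u%04X\\u%04X", (int) pair[0], (int) pair[1]);
        }
        // Within the Basic Multilingual Plane: one four-digit escape is enough.
        return String.format("\\u%04X", codePoint);
    }

    public static void main(String[] args) {
        // Code point 0x10FFFD is the one used in the report.
        System.out.println(escapeCodePoint(0x10FFFD));
    }
}
```

For code point 0x10FFFD this prints \uDBFF\uDFFD, the form the reporter expects, rather than \u10FFFD. A translator that is not tied to Java could presumably keep the single-value form and leave the splitting to a Java-specific variant.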
> StringEscapeUtils.escapeJava() does not output escaped surrogate pairs
> that are Java parsable
> ------------------------------------------------------------------------------------------------
>
> Key: LANG-858
> URL: https://issues.apache.org/jira/browse/LANG-858
> Project: Commons Lang
> Issue Type: Bug
> Components: lang.*, lang.text.translate.*
> Affects Versions: 3.x
> Reporter: Kazuki Hamasaki
> Priority: Minor
> Labels: escaping
> Attachments: JavaUnicodeEscape.patch
>
>
> In Java and ECMAScript, a Unicode escape of the form {{'\uxxxxxx'}} is not
> accepted. A supplementary character must be escaped as a high surrogate
> followed by a low surrogate.
> For example, if you pass in the surrogate pair
> {code:java}
> '\uDBFF\uDFFD'
> {code}
> the output must be
> {code:java}
> "\\uDBFF\\uDFFD"
> {code}
> However, you get
> {code:java}
> "\\u10FFFD"
> {code}
> Test case here:
> {code:java}
> @Test
> public void testEscapeSurrogatePairs() throws Exception {
>     assertEquals("\\uDBFF\\uDFFD",
>             StringEscapeUtils.escapeJava("\uDBFF\uDFFD"));
>     assertEquals("\\uDBFF\\uDFFD",
>             StringEscapeUtils.escapeEcmaScript("\uDBFF\uDFFD"));
> }
> {code}
> I have attached a patch that implements a simple solution.
> However, I think UnicodeEscaper.java should not be specific to Java. We need
> to discuss this.
> This issue does not occur in the unescape method.