Kazuki Hamasaki created LANG-858:
------------------------------------
Summary: StringEscapeUtils.escapeJava() does not output the
escaped surrogate pairs that is Java parsable
Key: LANG-858
URL: https://issues.apache.org/jira/browse/LANG-858
Project: Commons Lang
Issue Type: Bug
Components: lang.*, lang.text.translate.*
Affects Versions: 3.x
Reporter: Kazuki Hamasaki
Priority: Minor
Attachments: JavaUnicodeEscape.patch
Java and ECMAScript do not accept a Unicode escape with more than four hex
digits, such as {{'\uxxxxxx'}}. A supplementary character must instead be
written as two escapes: a high surrogate followed by a low surrogate.
For example, if you pass in the surrogate pair
{code:java}
"\uDBFF\uDFFD"
{code}
the output must be
{code:java}
"\\uDBFF\\uDFFD"
{code}
However, you get
{code:java}
"\\u10FFFD"
{code}
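To make the expected form concrete, here is a minimal sketch in plain Java that emits one four-digit escape per UTF-16 char, so a supplementary code point comes out as a high-surrogate escape followed by a low-surrogate escape. The class {{SurrogateEscapeSketch}} and the method {{escapeNonAscii}} are hypothetical names for illustration only. This is not the attached patch and not how UnicodeEscaper is implemented, and it does not handle backslashes or quotes as escapeJava() does.

```java
public class SurrogateEscapeSketch {

    // Hypothetical helper (not part of Commons Lang): escapes every char
    // outside printable ASCII as a four-hex-digit escape. Because it walks
    // UTF-16 chars rather than code points, a surrogate pair yields two escapes.
    public static String escapeNonAscii(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c < 0x20 || c > 0x7E) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+10FFFD is stored as the surrogate pair DBFF DFFD in a Java String.
        String supplementary = new String(Character.toChars(0x10FFFD));
        System.out.println(escapeNonAscii(supplementary));
    }
}
```

Iterating by char instead of by code point is the design choice that matters here: the Java and ECMAScript escape syntax takes exactly four hex digits, which matches one UTF-16 code unit.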
Here is a test case:
{code:java}
@Test
public void testEscapeSurrogatePairs() throws Exception {
    assertEquals("\\uDBFF\\uDFFD",
            StringEscapeUtils.escapeJava("\uDBFF\uDFFD"));
    assertEquals("\\uDBFF\\uDFFD",
            StringEscapeUtils.escapeEcmaScript("\uDBFF\uDFFD"));
}
{code}
I have attached a patch that implements a simple solution.
However, I think UnicodeEscaper.java should not be made Java-specific. We need
to discuss this.
The issue does not occur in the unescape methods.