StringEscapeUtils.unescapeXml(str) does not support supplemental characters.
----------------------------------------------------------------------------
Key: LANG-729
URL: https://issues.apache.org/jira/browse/LANG-729
Project: Commons Lang
Issue Type: Improvement
Components: lang.*
Affects Versions: 2.6
Reporter: Taro Yabuki
Priority: Trivial
Attachments: lang_2_6_unescapexml_20110716.diff
StringEscapeUtils.unescapeXml(str) does not unescape numeric character
references to supplemental characters (code points above U+FFFF):
String str2 = StringEscapeUtils.unescapeXml("&#144308;");
System.out.println(str2.codePointAt(0));
// prints 38, the code point of '&'
This output should be 144308.
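
For illustration, here is a minimal sketch of decoding a numeric character reference so that code points above U+FFFF survive. It is not the attached patch and not the Commons Lang implementation. The class NumericReferenceSketch and the method unescapeNumeric are hypothetical names, and the sketch handles only decimal and hexadecimal numeric references, not named entities. It relies on StringBuilder.appendCodePoint, which writes a surrogate pair when the value does not fit in a single char.

```java
public class NumericReferenceSketch {

    // Hypothetical helper, not part of Commons Lang: decodes decimal (&#NNN;)
    // and hexadecimal (&#xHHH;) numeric character references only.
    static String unescapeNumeric(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            // Look for "&#" at the current position and the ';' that closes it.
            int end = input.startsWith("&#", i) ? input.indexOf(';', i + 2) : -1;
            if (end > i + 2) {
                String body = input.substring(i + 2, end);
                boolean hex = body.startsWith("x") || body.startsWith("X");
                try {
                    int codePoint = hex
                            ? Integer.parseInt(body.substring(1), 16)
                            : Integer.parseInt(body, 10);
                    if (Character.isValidCodePoint(codePoint)) {
                        // appendCodePoint emits a surrogate pair for values above U+FFFF,
                        // instead of truncating the value to a single 16-bit char.
                        out.appendCodePoint(codePoint);
                        i = end + 1;
                        continue;
                    }
                } catch (NumberFormatException e) {
                    // Not a well-formed reference: fall through and copy '&' literally.
                }
            }
            out.append(input.charAt(i));
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = unescapeNumeric("&#144308;");
        System.out.println(decoded.codePointAt(0)); // 144308
        System.out.println(decoded.length());       // 2 (one surrogate pair)
    }
}
```

The design point is to treat the parsed value as an int code point throughout. Casting it to char, or rejecting it because it exceeds 0xFFFF, is the kind of step that would lose a supplemental character.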
Currently, StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) is
equal to str, so nothing appears to be wrong. However, as we reported in
LANG-728, StringEscapeUtils.escapeXml(str) has a bug. Once that bug is fixed,
StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) will no longer
be equal to str unless unescapeXml is fixed as well. That is not the behavior
we want. (Of course, we don't expect that
StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) is always equal
to str.)