Dominik Strecker created LANG-1343:
--------------------------------------
Summary: StringUtils#abbreviate breaks up surrogate pairs
Key: LANG-1343
URL: https://issues.apache.org/jira/browse/LANG-1343
Project: Commons Lang
Issue Type: Bug
Components: lang.*
Affects Versions: 3.6
Reporter: Dominik Strecker
Priority: Minor
If the last char of the retained substring is the first char (the high
surrogate) of a surrogate pair, the pair is split: the resulting string ends
with an unpaired high surrogate immediately followed by the ellipsis, so the
first char of the ellipsis sits where the low surrogate should be.
{code:java}
StringUtils.abbreviate("\uD83D\uDCA9\uD83D\uDCA9\uD83D\uDCA9", 4); // returns "\uD83D..."
{code}
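A possible caller-side workaround, as a minimal sketch: the helper below is hypothetical (abbreviateSafely is not part of Commons Lang) and only mirrors the simple two-argument case with a fixed "..." marker. It assumes that dropping the whole pair is acceptable when the cut would fall between its two chars, which means the result can be one char shorter than maxWidth.

```java
public class SurrogateSafeAbbreviate {

    // Hypothetical helper, NOT a Commons Lang API. It mimics the simple
    // abbreviate(str, maxWidth) case but never cuts inside a surrogate pair.
    public static String abbreviateSafely(String str, int maxWidth) {
        if (maxWidth < 4) {
            throw new IllegalArgumentException("maxWidth must be at least 4");
        }
        if (str == null || str.length() <= maxWidth) {
            return str;
        }
        int cut = maxWidth - 3; // chars kept before the ellipsis
        // If the cut falls between a high and a low surrogate, step back one
        // char so the whole pair is dropped instead of being split.
        if (Character.isHighSurrogate(str.charAt(cut - 1))
                && Character.isLowSurrogate(str.charAt(cut))) {
            cut--;
        }
        return str.substring(0, cut) + "...";
    }

    public static void main(String[] args) {
        String s = "\uD83D\uDCA9\uD83D\uDCA9\uD83D\uDCA9";
        // Width 4 leaves room for one char only, so the pair is dropped.
        System.out.println(abbreviateSafely(s, 4).equals("..."));
        // Width 5 leaves room for one complete pair.
        System.out.println(abbreviateSafely(s, 5).equals("\uD83D\uDCA9..."));
    }
}
```

Both lines print true. Whether the library itself should behave like this, or only document the current behaviour, is a separate question.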
In my case this causes a failure further along, when the string is encoded as
UTF-8 for a SOAP request.
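To illustrate the kind of downstream problem, here is a small JDK-only sketch. It is not the actual SOAP code path; the class name and the isEncodableAsUtf8 helper are made up for illustration. It shows that String.getBytes silently replaces the unpaired surrogate with '?', while a strict CharsetEncoder rejects the string.

```java
import java.nio.charset.StandardCharsets;

public class LoneSurrogateUtf8 {

    // Hypothetical helper: true only if a strict UTF-8 encoder accepts the
    // string, i.e. it contains no unpaired surrogates.
    public static boolean isEncodableAsUtf8(String s) {
        return StandardCharsets.UTF_8.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        String broken = "\uD83D...";          // result reported above
        String intact = "\uD83D\uDCA9...";    // complete pair plus ellipsis

        // getBytes never throws; the lone surrogate becomes a single '?'.
        byte[] bytes = broken.getBytes(StandardCharsets.UTF_8);
        System.out.println(bytes.length);        // 4
        System.out.println((char) bytes[0]);     // ?

        // A strict encoder refuses the broken string.
        System.out.println(isEncodableAsUtf8(broken)); // false
        System.out.println(isEncodableAsUtf8(intact)); // true
    }
}
```

A check like this could be used as a guard before handing the abbreviated string to a serializer that rejects malformed input.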
Should this at least be mentioned in the Javadoc?