[lang3] StringUtils does not handle supplementary characters correctly

Jason Pickens Tue, 06 Aug 2019 02:13:47 -0700

Hi,

I was wondering whether StringUtils is supposed to handle Unicode
supplementary characters (code points above U+FFFF) correctly.


For example, org.apache.commons.lang3.StringUtils#isAlphanumeric returns
false for code point 65536 (U+10000), which is actually a letter. This is
because it uses java.lang.CharSequence#charAt rather than
java.lang.CharSequence#codePoints. charAt returns individual UTF-16 code
units, so a supplementary code point is seen as a high surrogate followed
by a low surrogate, and neither of those is a letter or digit on its own.
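
Here is a minimal sketch of the difference using only the JDK. The class
and method names (SupplementaryDemo, isAlphanumericByChar,
isAlphanumericByCodePoint) are hypothetical and not part of Commons Lang.
The charAt-based method is only my approximation of the behaviour described
above, not the library's actual source.

```java
public class SupplementaryDemo {

    // Approximates a charAt-based check: each UTF-16 code unit is tested
    // on its own, so surrogate halves are never recognised as letters.
    static boolean isAlphanumericByChar(CharSequence cs) {
        if (cs == null || cs.length() == 0) {
            return false;
        }
        for (int i = 0; i < cs.length(); i++) {
            if (!Character.isLetterOrDigit(cs.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Code-point-aware variant: surrogate pairs are combined into a single
    // int code point before being tested.
    static boolean isAlphanumericByCodePoint(CharSequence cs) {
        if (cs == null || cs.length() == 0) {
            return false;
        }
        return cs.codePoints().allMatch(cp -> Character.isLetterOrDigit(cp));
    }

    public static void main(String[] args) {
        // U+10000 (decimal 65536) is stored as two chars in a String.
        String s = new String(Character.toChars(0x10000));

        System.out.println(isAlphanumericByChar(s));      // prints false
        System.out.println(isAlphanumericByCodePoint(s)); // prints true
    }
}
```

The code point variant relies on CharSequence#codePoints, so it assumes
Java 8 or later.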


Cheers,

Jason
