alhudz commented on PR #1687: URL: https://github.com/apache/commons-lang/pull/1687#issuecomment-4639160153
Not endemic, from what I can tell. Most of the `String`/`char` code in Lang operates on `char`s by design and is documented that way, so it's fine. The bug class I've been hitting is narrower: a method that scans by `char` index but then exposes a *code-point* count or contract at its boundary, so a supplementary character throws the count off by one.

I've found three of those seams so far, each with a reproducer:

- `CharSequenceUtils.lastIndexOf` (#1684, merged)
- `StringUtils.indexOfAny` (this one)
- `LookupTranslator.translate` (#1691): returned the matched key length in `char`s where the translator loop advances by `Character.charCount`

The way I find them is to look for places that cross between char-indexed scanning and a code-point boundary, then build a supplementary-key case and check the count. I haven't done an exhaustive sweep of the whole class, so I can't promise these are the last three, but they're the ones I could actually reproduce rather than guess at.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
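The supplementary-key check described in the comment can be sketched roughly as follows. This is a minimal illustration, not code from Commons Lang: the class `SupplementarySeamCheck` and the methods `consumedChars`, `consumedCodePoints`, and `advance` are hypothetical names made up for the example. It assumes a matcher that reports its match length in `char`s and a caller that steps by `Character.charCount`, which is the kind of seam the comment describes.

```java
final class SupplementarySeamCheck {

    // Hypothetical matcher: reports how much input the key consumed, in chars (UTF-16 code units).
    static int consumedChars(String input, int index, String key) {
        return input.startsWith(key, index) ? key.length() : 0;
    }

    // The same match, reported in code points instead.
    static int consumedCodePoints(String input, int index, String key) {
        return input.startsWith(key, index) ? key.codePointCount(0, key.length()) : 0;
    }

    // Hypothetical caller: advances one code point per unit of the reported count.
    static int advance(String input, int index, int count) {
        int pos = index;
        for (int i = 0; i < count && pos < input.length(); i++) {
            pos += Character.charCount(input.codePointAt(pos));
        }
        return pos;
    }

    public static void main(String[] args) {
        // U+1F600 is a supplementary character: 1 code point, 2 chars.
        String key = new String(Character.toChars(0x1F600));
        String input = key + "ab";

        // A char count fed to a code-point loop overshoots by one and skips 'a'.
        int wrong = advance(input, 0, consumedChars(input, 0, key));      // 3
        // A code-point count lands right after the key.
        int right = advance(input, 0, consumedCodePoints(input, 0, key)); // 2

        System.out.println("char count -> index " + wrong);
        System.out.println("code-point count -> index " + right);
    }
}
```

With a key made only of BMP characters the two counts agree, so the mismatch shows up only when the test key contains a supplementary character.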
