alhudz opened a new pull request, #1722:
URL: https://github.com/apache/commons-lang/pull/1722

   Repro: `WordUtils.initials("Ben 😀mile Lee")` where the second word begins 
with U+1F600.
   Cause: the loop copies only the first `char` after a delimiter and skips the 
rest of the word. When a word starts with a supplementary code point, only the 
high surrogate is kept and the low surrogate is dropped, leaving a lone 
surrogate in the result (`B` + U+D83D + `L`).
   Fix: copy the trailing low surrogate together with its high half, and size 
the buffer to the input length so a two-`char` initial cannot run past it. BMP 
input is unchanged.
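
   The fix described above can be illustrated with a minimal, standalone sketch. 
This is not the patch itself: `InitialsSketch` and its `initials` method are 
hypothetical names, and the sketch assumes whitespace is the only delimiter 
(no custom delimiter array).

```java
public final class InitialsSketch {

    // Hypothetical stand-in for the patched method; whitespace-only delimiters assumed.
    public static String initials(final String str) {
        if (str == null || str.isEmpty()) {
            return str;
        }
        final int len = str.length();
        // Sized to the input length, because an initial may occupy two chars.
        final char[] buf = new char[len];
        int count = 0;
        boolean lastWasGap = true;
        for (int i = 0; i < len; i++) {
            final char ch = str.charAt(i);
            if (Character.isWhitespace(ch)) {
                lastWasGap = true;
            } else if (lastWasGap) {
                buf[count++] = ch;
                // Copy the trailing low surrogate together with its high half.
                if (Character.isHighSurrogate(ch) && i + 1 < len
                        && Character.isLowSurrogate(str.charAt(i + 1))) {
                    buf[count++] = str.charAt(++i);
                }
                lastWasGap = false;
            }
            // Any other char belongs to the rest of the current word and is skipped.
        }
        return new String(buf, 0, count);
    }

    public static void main(final String[] args) {
        // U+1F600 is written as its surrogate pair so the source stays ASCII.
        final String out = initials("Ben \uD83D\uDE00mile Lee");
        System.out.println(out.equals("B\uD83D\uDE00L"));
    }
}
```

   With this shape, a word that starts with U+1F600 contributes both halves of 
the pair, while BMP-only input takes the same path as before.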

