Ben-Waters opened a new pull request, #189: URL: https://github.com/apache/commons-codec/pull/189
The current implementation of NYSIIS can incorrectly remove the first character from the encoding. According to the algorithm, the first character of the input string should always be the first character of the encoding; the remaining rules are then applied to the rest of the string, removing characters as they go. The implementation in commons-codec instead passes the entire string into the transcodeRemaining method, which works for the most part, and afterwards checks only that there is at least 1 character before removing the final 'A' or 'S'. The problem is that a word like "ASH" ends up with "A" as its single remaining character, and "SSH" similarly ends up with "S". The current logic then removes that character and returns a blank string, when it should still return at least the first letter of the original string.
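The failure mode can be illustrated with a minimal, self-contained sketch. This is not the commons-codec source: the class `NysiisTailSketch` and its methods are hypothetical names, and the input is assumed to be a key that has already been transcoded. It only contrasts a tail-stripping step that accepts any non-empty key with one that never touches the first character.

```java
/** Illustrative sketch only; hypothetical names, not the commons-codec implementation. */
final class NysiisTailSketch {

    /** Mirrors the described behavior: any non-empty key may lose its final 'A' or 'S'. */
    static String stripTailUnguarded(String key) {
        StringBuilder sb = new StringBuilder(key);
        if (sb.length() > 0 && isAOrS(sb.charAt(sb.length() - 1))) {
            sb.deleteCharAt(sb.length() - 1);
        }
        return sb.toString();
    }

    /** Guarded variant: the first character is never a candidate for removal. */
    static String stripTailKeepFirst(String key) {
        StringBuilder sb = new StringBuilder(key);
        if (sb.length() > 1 && isAOrS(sb.charAt(sb.length() - 1))) {
            sb.deleteCharAt(sb.length() - 1);
        }
        return sb.toString();
    }

    private static boolean isAOrS(char c) {
        return c == 'A' || c == 'S';
    }

    public static void main(String[] args) {
        // A one-letter key such as the "A" left over from "ASH".
        System.out.println("[" + stripTailUnguarded("A") + "]"); // prints []
        System.out.println("[" + stripTailKeepFirst("A") + "]"); // prints [A]
    }
}
```

In the guarded variant the only change is the length check (`> 1` instead of `> 0`), which is one way to guarantee that the result keeps at least the first letter; the actual fix in the pull request may be structured differently.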
