Sheung Chi Chan created CODEC-311:
-------------------------------------
Summary: RefinedSoundex.getMappingCode throw
ArrayIndexOutOfBoundException
Key: CODEC-311
URL: https://issues.apache.org/jira/browse/CODEC-311
Project: Commons Codec
Issue Type: Bug
Reporter: Sheung Chi Chan
The {{getMappingCode(char)}} method receives an arbitrary character taken from
a string (while {{encode(String)}} or {{soundex(String)}} processes that
string). If the character is a letter, the method returns a mapping code from
the {{soundexMapping}} array. However, the letter check contains a bug.
{{Character.isLetter()}} returns true for more than just English letters. For
example, a char with character code 1689 also makes {{Character.isLetter()}}
return true. When a character with such a large character code passes the
{{Character.isLetter()}} check, the index calculated from it falls far beyond
the end of the much smaller {{soundexMapping}} array, and the lookup throws
ArrayIndexOutOfBoundsException.
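The failure can be sketched outside the library as follows. This is a simplified stand-in, not the actual Commons Codec source: the class name {{MappingOverflowDemo}}, the method name {{getMappingCodeUnchecked}}, and the 26-entry table contents are assumptions made for illustration. Only the shape of the check (a letter test followed by an unchecked array lookup) is taken from the description above.

```java
class MappingOverflowDemo {
    // Assumption: a 26-entry table, one code per letter 'A'..'Z'.
    static final char[] MAPPING = "01360240043788015936020505".toCharArray();

    // Hypothetical stand-in for the lookup described in the report:
    // a letter check followed by an array access with no bounds check.
    static char getMappingCodeUnchecked(char c) {
        if (!Character.isLetter(c)) {
            return 0;
        }
        return MAPPING[Character.toUpperCase(c) - 'A'];
    }

    public static void main(String[] args) {
        char c = (char) 1689; // a non-English letter, as in the report
        System.out.println("isLetter: " + Character.isLetter(c));
        try {
            getMappingCodeUnchecked(c);
            System.out.println("no exception");
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught: " + e);
        }
    }
}
```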
A possible fix is to add a bounds check so that the index never falls outside
the configured {{soundexMapping}} array. If the calculated index is out of
bounds, the method simply returns 0, just as the original logic does when
{{Character.isLetter()}} returns false.
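A minimal sketch of such a bounds check, under the same caveats: the class {{BoundedMappingDemo}} and its table are hypothetical stand-ins for illustration, not the patch applied to Commons Codec. The check on the lower bound is purely defensive.

```java
class BoundedMappingDemo {
    // Assumption: a 26-entry table, one code per letter 'A'..'Z'.
    static final char[] MAPPING = "01360240043788015936020505".toCharArray();

    // Returns the mapping code for c, or 0 when c is not a letter
    // or when its index falls outside the mapping table.
    static char getMappingCode(char c) {
        if (!Character.isLetter(c)) {
            return 0;
        }
        int index = Character.toUpperCase(c) - 'A';
        if (index < 0 || index >= MAPPING.length) {
            return 0; // same result as the non-letter case
        }
        return MAPPING[index];
    }

    public static void main(String[] args) {
        // Print numeric values so that the 0 result is visible.
        System.out.println((int) getMappingCode((char) 1689));
        System.out.println((int) getMappingCode('b'));
    }
}
```

With this guard, letters outside the table are treated the same way as non-letters, so callers see no new behavior other than the missing exception.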
We found this bug by fuzzing with OSS-Fuzz. It is reported at
[https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=64353].
--
This message was sent by Atlassian Jira
(v8.20.10#820010)