Sheung Chi Chan created CODEC-311:
-------------------------------------

             Summary: RefinedSoundex.getMappingCode throw 
ArrayIndexOutOfBoundException
                 Key: CODEC-311
                 URL: https://issues.apache.org/jira/browse/CODEC-311
             Project: Commons Codec
          Issue Type: Bug
            Reporter: Sheung Chi Chan


The {{getMappingCode(char)}} method takes an arbitrary character from a string 
(while {{encode(String)}} or {{soundex(String)}} processes it), checks whether 
it is a letter, and if so returns a mapping code from the {{soundexMapping}} 
array. The check is insufficient: {{Character.isLetter()}} returns true for 
letters in any script, not only for the English letters A-Z. For example, a 
char with character code 1689 also makes {{Character.isLetter()}} return true. 
The array index is derived from the character code, so a character with a 
large code that passes the {{Character.isLetter()}} check indexes far past the 
end of the much smaller {{soundexMapping}} array and causes an 
ArrayIndexOutOfBoundsException.
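
The failure can be reproduced outside the library with a minimal sketch. The 
class name, the method name {{unguardedLookup}} and the 26-entry table below 
are illustrative assumptions, not the actual Commons Codec source; the lookup 
only mirrors the logic described above (letter check, then index by offset 
from 'A').

```java
public class MappingIndexDemo {
    // Hypothetical 26-entry table, one code per letter 'A'..'Z' (assumed for illustration).
    static final char[] MAPPING = "01360240043788015936020505".toCharArray();

    // Mirrors the described logic: reject non-letters, then index by offset from 'A'.
    static char unguardedLookup(char c) {
        if (!Character.isLetter(c)) {
            return 0;
        }
        return MAPPING[Character.toUpperCase(c) - 'A'];
    }

    public static void main(String[] args) {
        char c = (char) 1689; // U+0699, a letter outside A-Z
        System.out.println(Character.isLetter(c));          // prints "true"
        System.out.println(Character.toUpperCase(c) - 'A'); // prints "1624"
        try {
            unguardedLookup(c);
        } catch (ArrayIndexOutOfBoundsException e) {
            // Index 1624 is far beyond the 26-entry table.
            System.out.println("out of bounds: " + e.getMessage());
        }
    }
}
```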

A possible fix is to add a conditional check that ensures the index is never 
out of bounds for the configured {{soundexMapping}} array. If the calculated 
index is out of bounds, the method simply returns 0, just as the original 
logic does when {{Character.isLetter()}} returns false.
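
A minimal sketch of such a guard, assuming the lookup indexes the mapping by 
offset from 'A'. The class name, the method name {{guardedLookup}} and the 
sample table are hypothetical and only illustrate the idea; this is not the 
patch applied to Commons Codec.

```java
public class GuardedMappingSketch {
    // Hypothetical 26-entry table, one code per letter 'A'..'Z' (assumed for illustration).
    static final char[] MAPPING = "01360240043788015936020505".toCharArray();

    // Returns the mapping code for c, or 0 when c is not a letter
    // or its index falls outside the supplied mapping array.
    static char guardedLookup(char c, char[] mapping) {
        if (!Character.isLetter(c)) {
            return 0;
        }
        int index = Character.toUpperCase(c) - 'A';
        // Defensive bounds check: letters outside A-Z yield a large index.
        if (index < 0 || index >= mapping.length) {
            return 0;
        }
        return mapping[index];
    }

    public static void main(String[] args) {
        System.out.println((int) guardedLookup((char) 1689, MAPPING)); // prints "0"
        System.out.println(guardedLookup('b', MAPPING));               // prints "1"
    }
}
```

Checking against the array length rather than a fixed 'A'..'Z' range keeps 
the guard valid for custom mappings of a different size.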

We found this bug using fuzzing by way of OSS-Fuzz. It is reported at 
[https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=64353].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
