DRUser123 created CODEC-317:
-------------------------------

             Summary: ColognePhonetic: Duplicate code in some cases
                 Key: CODEC-317
                 URL: https://issues.apache.org/jira/browse/CODEC-317
             Project: Commons Codec
          Issue Type: Bug
    Affects Versions: 1.16.1, 1.15
            Reporter: DRUser123


h2. ColognePhonetic: Duplicate code in some cases

When the character "H" or an intermediate vowel (one that is not at the 
beginning of the string) is encountered, its code is correctly left out of the 
output. However, the lastCode variable is still set to that skipped code, so 
the next code is compared against it rather than against the last code 
actually written, and a duplicate code goes undetected. 

The code in question is *ColognePhonetic$CologneOutputBuffer.put(code), 
line 275 in version 1.16.1 (also tested with 1.15).*

{+}Example with Müller (correctly coded){+}:
Char = 'M', code = 6, lastCode = null, output = '6'
Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are 
added)
Char = 'L', code = 5, lastCode = 0, output = '65'   
Char = 'L', code = 5, lastCode = 5, output = '65' (no duplicate codes are added)
Char = 'E', code = 0, lastCode = 5, output = '65' (no intermediate zeros are 
added)
Char = 'R', code = 7, lastCode = 0, output = '657' 

{+}Example with Mülhler (incorrectly coded){+}:
Char = 'M', code = 6, lastCode = null, output = '6'
Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are 
added)
Char = 'L', code = 5, lastCode = 0, output = '65'   
Char = 'H', code = -, lastCode = 5, output = '65' 
Char = 'L', {*}code = 5, lastCode = -{*}, output = '655' ({*}Fails to identify 
duplicate code{*})
Char = 'E', code = 0, lastCode = 5, output = '655' (no intermediate zeros are 
added)
Char = 'R', code = 7, lastCode = 0, output = '6557' 

{+}Example with Müleler (incorrectly coded){+}:
Char = 'M', code = 6, lastCode = null, output = '6'
Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are 
added)
Char = 'L', code = 5, lastCode = 0, output = '65'   
Char = 'E', code = 0, lastCode = 5, output = '65' (no intermediate zeros are 
added)
Char = 'L', {*}code = 5, lastCode = 0{*}, output = '655' ({*}Fails to identify 
duplicate code{*})
Char = 'E', code = 0, lastCode = 5, output = '655' (no intermediate zeros are 
added)
Char = 'R', code = 7, lastCode = 0, output = '6557' 
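
The three traces above can be reproduced with a small stand-alone model. This is only a sketch of the behavior described in this report, not the Commons Codec source: the class LastCodeSketch and the methods reported, expected and shouldAppend are hypothetical names, and the per-character codes are passed in directly as a string (M=6, U/E=0, L=5, H=-, R=7, as in the traces). The expected variant assumes that lastCode should change only when a code is actually appended, which is one reading of the behavior requested here and not a confirmed fix.

```java
final class LastCodeSketch {
    // Marker for characters such as 'H' that produce no code ('-' in the traces).
    static final char IGNORE = '-';

    // Shared rule: skip ignored codes, repeats of lastCode, and zeros that
    // are not at the start of the output.
    static boolean shouldAppend(char code, char lastCode, int length) {
        return code != IGNORE && code != lastCode && (code != '0' || length == 0);
    }

    // Behavior as traced in the report: lastCode is overwritten for every
    // code, including the ones that are not appended.
    static String reported(String codes) {
        StringBuilder out = new StringBuilder();
        char lastCode = '/'; // sentinel: no previous code yet
        for (char code : codes.toCharArray()) {
            if (shouldAppend(code, lastCode, out.length())) {
                out.append(code);
            }
            lastCode = code; // also runs for skipped codes
        }
        return out.toString();
    }

    // Assumed expectation: lastCode changes only when a code is appended.
    static String expected(String codes) {
        StringBuilder out = new StringBuilder();
        char lastCode = '/'; // sentinel: no previous code yet
        for (char code : codes.toCharArray()) {
            if (shouldAppend(code, lastCode, out.length())) {
                out.append(code);
                lastCode = code;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(reported("605507"));  // first trace: 657
        System.out.println(reported("605-507")); // second trace ('H' between the 'L's): 6557
        System.out.println(reported("6050507")); // third trace (vowel between the 'L's): 6557
        System.out.println(expected("605-507")); // 657
        System.out.println(expected("6050507")); // 657
    }
}
```

In this model the only difference between the two variants is where lastCode is assigned, which matches the description of put(code) given above.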



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
