DRUser123 created CODEC-317:
-------------------------------
Summary: ColognePhonetic: Duplicate code in some cases
Key: CODEC-317
URL: https://issues.apache.org/jira/browse/CODEC-317
Project: Commons Codec
Issue Type: Bug
Affects Versions: 1.16.1, 1.15
Reporter: DRUser123
h2. ColognePhonetic: Duplicate code in some cases
When the encoder encounters the character "H" or an intermediate vowel (one
that is not at the beginning of the string), the corresponding code is
correctly left out of the output. However, the lastCode variable is still set
to that skipped code, so the next consonant is compared against the skipped
code rather than the last code actually written, and a duplicate code goes
undetected.
The code in question is *ColognePhonetic$CologneOutputBuffer.put(code),
line 275 in version 1.16.1 (also tested with 1.15).*
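The behaviour described above can be sketched with a small stand-alone model. This is not the Commons Codec source: the class name CologneBufferSketch, the helper encodeCodes, and the use of '\0' for "lastCode = null" are assumptions made for illustration. The only details taken from this report are the '-' code shown for "H", the suppression of intermediate zeros, and the unconditional update of lastCode.

```java
// Simplified model of the output-buffer logic described in this report.
// NOT the Commons Codec implementation; all names here are hypothetical.
final class CologneBufferSketch {
    static final char IGNORE = '-'; // code shown for 'H' in the traces

    private final StringBuilder out = new StringBuilder();
    private char lastCode = '\0'; // stands in for "lastCode = null"

    void put(char code) {
        boolean ignored = code == IGNORE;
        boolean duplicate = code == lastCode;
        boolean innerZero = code == '0' && out.length() > 0;
        if (!ignored && !duplicate && !innerZero) {
            out.append(code);
        }
        // Reported problem: lastCode changes even when nothing was appended.
        lastCode = code;
    }

    // Feeds a sequence of already-computed codes through the buffer.
    static String encodeCodes(String codes) {
        CologneBufferSketch buffer = new CologneBufferSketch();
        for (char c : codes.toCharArray()) {
            buffer.put(c);
        }
        return buffer.out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeCodes("55"));  // L, L    -> 5
        System.out.println(encodeCodes("5-5")); // L, H, L -> 55
        System.out.println(encodeCodes("505")); // L, E, L -> 55
    }
}
```

In this model the second 5 is dropped only when it directly follows another 5; a skipped '-' or '0' in between overwrites lastCode and lets the repeat through.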
{+}Example with Müller (correctly coded){+}:
Char = 'M', code = 6, lastCode = null, output = '6'
Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are
added)
Char = 'L', code = 5, lastCode = 0, output = '65'
Char = 'L', code = 5, lastCode = 5, output = '65' (no duplicate codes are added)
Char = 'E', code = 0, lastCode = 5, output = '65' (no intermediate zeros are
added)
Char = 'R', code = 7, lastCode = 0, output = '657'
{+}Example with Mülhler (incorrectly coded){+}:
Char = 'M', code = 6, lastCode = null, output = '6'
Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are
added)
Char = 'L', code = 5, lastCode = 0, output = '65'
Char = 'H', code = -, lastCode = 5, output = '65'
Char = 'L', {*}code = 5, lastCode = -{*}, output = '655' ({*}Fails to identify
duplicate code{*})
Char = 'E', code = 0, lastCode = 5, output = '655' (no intermediate zeros are
added)
Char = 'R', code = 7, lastCode = 0, output = '6557'
{+}Example with Müleler (incorrectly coded){+}:
Char = 'M', code = 6, lastCode = null, output = '6'
Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are
added)
Char = 'L', code = 5, lastCode = 0, output = '65'
Char = 'E', code = 0, lastCode = 5, output = '65' (no intermediate zeros are
added)
Char = 'L', {*}code = 5, lastCode = 0{*}, output = '655' ({*}Fails to identify
duplicate code{*})
Char = 'E', code = 0, lastCode = 5, output = '655' (no intermediate zeros are
added)
Char = 'R', code = 7, lastCode = 0, output = '6557'
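
One possible reading of the expected behaviour in the three traces is that lastCode should change only when a code is actually written to the output. The sketch below models that reading; it is an assumption drawn from this report, not the project's fix, and the class name CologneBufferVariantSketch and helper encodeCodes are hypothetical. The input strings are the code sequences taken from the traces above.

```java
// Hypothetical variant: lastCode is updated only when a code is appended.
// This illustrates the behaviour the report appears to expect; it is not
// taken from Commons Codec.
final class CologneBufferVariantSketch {
    static final char IGNORE = '-'; // code shown for 'H' in the traces

    private final StringBuilder out = new StringBuilder();
    private char lastCode = '\0'; // stands in for "lastCode = null"

    void put(char code) {
        boolean ignored = code == IGNORE;
        boolean duplicate = code == lastCode;
        boolean innerZero = code == '0' && out.length() > 0;
        if (!ignored && !duplicate && !innerZero) {
            out.append(code);
            lastCode = code; // skipped codes leave lastCode untouched
        }
    }

    // Feeds a sequence of already-computed codes through the buffer.
    static String encodeCodes(String codes) {
        CologneBufferVariantSketch buffer = new CologneBufferVariantSketch();
        for (char c : codes.toCharArray()) {
            buffer.put(c);
        }
        return buffer.out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeCodes("605507"));  // Müller  -> 657
        System.out.println(encodeCodes("605-507")); // Mülhler -> 657
        System.out.println(encodeCodes("6050507")); // Müleler -> 657
    }
}
```

Under this variant all three names produce 657, which is the result the report treats as correct.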