[
https://issues.apache.org/jira/browse/CODEC-317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18058699#comment-18058699
]
Shalu Jha edited comment on CODEC-317 at 2/15/26 6:36 AM:
----------------------------------------------------------
Hello [~ggregory]
I have been looking into this issue and have a question about the expected behavior:
How should 0 be treated in this implementation?
Should 0 act as a separator between identical digits? It would not be appended
but would still update lastCode, which preserves both digits in patterns like
...6 0 6... (hoffmann -> 0366).
Or should 0 not act as a separator? It would be neither appended nor stored in
lastCode, which collapses those duplicates (hoffmann -> 036, and müleler -> 657).
I want to confirm the expected behavior before finalizing the fix.
Thank you!
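
To make the two options concrete, here is a minimal Java sketch of a simplified output buffer. It is not the Commons Codec implementation: the class ZeroPolicySketch, the method encode, and the flag zeroSeparates are hypothetical, and the input is a string of already-assigned Cologne codes ('-' standing for an ignored H) rather than raw text. The sketch also assumes that an ignored H never updates lastCode under either option.

```java
/** Hypothetical model of the two options; not the Commons Codec code. */
final class ZeroPolicySketch {

    /**
     * Collapses a sequence of pre-assigned codes.
     *
     * @param codes         one char per input letter: a digit code, or '-' for an ignored letter (H)
     * @param zeroSeparates true for option 1 (a skipped 0 still updates lastCode),
     *                      false for option 2 (a skipped 0 leaves lastCode untouched)
     */
    static String encode(final String codes, final boolean zeroSeparates) {
        final StringBuilder out = new StringBuilder();
        char lastCode = '/'; // sentinel that never matches a real code
        for (final char code : codes.toCharArray()) {
            if (code == '-') {
                continue; // assumption: an ignored H never touches lastCode
            }
            if (code == '0' && out.length() > 0) {
                // A non-leading 0 is never appended; the options differ only here.
                if (zeroSeparates) {
                    lastCode = code;
                }
                continue;
            }
            if (code != lastCode) {
                out.append(code);
            }
            lastCode = code;
        }
        return out.toString();
    }

    public static void main(final String[] args) {
        final String hoffmann = "-0336066"; // H O F F M A N N
        final String mueleler = "6050507";  // M U L E L E R
        System.out.println(encode(hoffmann, true));  // 0366
        System.out.println(encode(hoffmann, false)); // 036
        System.out.println(encode(mueleler, true));  // 6557
        System.out.println(encode(mueleler, false)); // 657
    }
}
```

In this model, option 1 fixes only the H case: the codes for mülhler (605-507) collapse to 657 in both modes, while müleler stays at 6557 unless 0 stops acting as a separator.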
> ColognePhonetic: Duplicate code in some cases
> ---------------------------------------------
>
> Key: CODEC-317
> URL: https://issues.apache.org/jira/browse/CODEC-317
> Project: Commons Codec
> Issue Type: Bug
> Affects Versions: 1.15, 1.16.1
> Reporter: DRUser123
> Priority: Major
>
> h2. ColognePhonetic: Duplicate code in some cases
> When the character "H" or an intermediate vowel (one not at the beginning of
> the string) is encountered, its code is correctly left out of the output.
> However, lastCode is still overwritten with that skipped code, so the next
> character is compared against the skipped code instead of the last emitted
> one, and a duplicate code goes undetected.
> The code in question is
> *ColognePhonetic$CologneOutputBuffer.put(code), line 275 in version 1.16.1
> (also reproduced with 1.15).*
> {+}Example with Müller (correctly coded){+}:
> Char = 'M', code = 6, lastCode = null, output = '6'
> Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are
> added)
> Char = 'L', code = 5, lastCode = 0, output = '65'
> Char = 'L', code = 5, lastCode = 5, output = '65' (no duplicate codes are
> added)
> Char = 'E', code = 0, lastCode = 5, output = '65' (no intermediate zeros are
> added)
> Char = 'R', code = 7, lastCode = 0, output = '657'
> {+}Example with Mülhler (incorrectly coded){+}:
> Char = 'M', code = 6, lastCode = null, output = '6'
> Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are
> added)
> Char = 'L', code = 5, lastCode = 0, output = '65'
> Char = 'H', code = -, lastCode = 5, output = '65'
> Char = 'L', {*}code = 5, lastCode = -{*}, output = '655' ({*}Fails to
> identify duplicate code{*})
> Char = 'E', code = 0, lastCode = 5, output = '655' (no intermediate zeros are
> added)
> Char = 'R', code = 7, lastCode = 0, output = '6557'
> {+}Example with Müleler (incorrectly coded){+}:
> Char = 'M', code = 6, lastCode = null, output = '6'
> Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are
> added)
> Char = 'L', code = 5, lastCode = 0, output = '65'
> Char = 'E', code = 0, lastCode = 5, output = '65' (no intermediate zeros are
> added)
> Char = 'L', {*}code = 5, lastCode = 0{*}, output = '655' ({*}Fails to
> identify duplicate code{*})
> Char = 'E', code = 0, lastCode = 5, output = '655' (no intermediate zeros are
> added)
> Char = 'R', code = 7, lastCode = 0, output = '6557'
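
The three traces above can be replayed with a small Java model of the behavior the reporter describes, where lastCode is overwritten on every call whether or not the code was appended. This is a sketch for illustration only, not a copy of CologneOutputBuffer.put: the class PutTraceSketch and its method names are hypothetical, '/' is an arbitrary initial value for lastCode, and the inputs are the per-letter codes listed in the traces ('-' for H).

```java
/** Hypothetical model of the reported behavior; not the Commons Codec source. */
final class PutTraceSketch {

    private final StringBuilder output = new StringBuilder();
    private char lastCode = '/'; // arbitrary start value that matches no code

    /** Appends code unless it is ignored, a duplicate, or a non-leading 0. */
    void put(final char code) {
        final boolean ignored = code == '-';
        final boolean duplicate = code == lastCode;
        final boolean innerZero = code == '0' && output.length() > 0;
        if (!ignored && !duplicate && !innerZero) {
            output.append(code);
        }
        // Reported problem: lastCode changes even when nothing was appended.
        lastCode = code;
    }

    /** Feeds every code of one word through a fresh buffer. */
    static String replay(final String codes) {
        final PutTraceSketch buffer = new PutTraceSketch();
        for (final char code : codes.toCharArray()) {
            buffer.put(code);
        }
        return buffer.output.toString();
    }

    public static void main(final String[] args) {
        System.out.println(replay("605507"));  // Müller:  657
        System.out.println(replay("605-507")); // Mülhler: 6557
        System.out.println(replay("6050507")); // Müleler: 6557
    }
}
```

The second and third inputs differ from the first only by a skipped code between the two 5s, which is enough to defeat the duplicate check in this model.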