[ 
https://issues.apache.org/jira/browse/CODEC-317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18058699#comment-18058699
 ] 

Shalu Jha commented on CODEC-317:
---------------------------------

Hello [~ggregory] 

I have been looking into this issue and have one question:

How should code 0 (a non-leading vowel) be treated in this implementation?

Option 1: 0 acts as a separator between identical codes. It is not appended,
but it still updates lastCode, so both codes are kept in patterns like
...6 0 6... (hoffmann -> 0366).
Option 2: 0 does not act as a separator. It is not appended and does not
update lastCode, so those duplicates collapse (hoffmann -> 036, and
müleler -> 657).
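
To make the two treatments of 0 concrete, here is a minimal sketch. It is not
the actual CologneOutputBuffer code: the class name ZeroSeparatorSketch, the
zeroSeparates flag, and the reduced letter table (only the letters used in the
examples) are hypothetical, and the sketch assumes that H never updates
lastCode.

```java
import java.util.Locale;

/** Hypothetical sketch of the two options; not the Commons Codec implementation. */
final class ZeroSeparatorSketch {

    /** Illustrative code table covering only the letters used in the examples. */
    private static char codeOf(char c) {
        switch (c) {
            case 'A': case 'E': case 'O': case 'U': case '\u00DC':
                return '0'; // vowels (U-umlaut is treated like U)
            case 'F':
                return '3';
            case 'L':
                return '5';
            case 'M': case 'N':
                return '6';
            case 'R':
                return '7';
            case 'H':
                return '-'; // ignored letter
            default:
                throw new IllegalArgumentException("letter not in sketch table: " + c);
        }
    }

    /**
     * @param zeroSeparates true: a non-leading 0 is dropped but still updates lastCode;
     *                      false: a non-leading 0 is dropped and leaves lastCode alone
     */
    static String encode(String word, boolean zeroSeparates) {
        StringBuilder out = new StringBuilder();
        char lastCode = '/'; // sentinel: no code seen yet
        for (char c : word.toUpperCase(Locale.ROOT).toCharArray()) {
            char code = codeOf(c);
            if (code == '-') {
                continue; // assumption: H never touches lastCode
            }
            if (code == '0') {
                if (out.length() == 0) {
                    out.append('0'); // a leading 0 is kept
                    lastCode = '0';
                } else if (zeroSeparates) {
                    lastCode = '0'; // not appended, but breaks the run of equal codes
                }
                continue;
            }
            if (code != lastCode) {
                out.append(code); // append only when it differs from the previous code
            }
            lastCode = code;
        }
        return out.toString();
    }
}
```

With this sketch, encode("hoffmann", true) gives 0366 and
encode("hoffmann", false) gives 036; for müleler the results are 6557 and 657
respectively. Mülhler gives 657 in both cases, because H is skipped without
touching lastCode.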

I want to confirm the expected behavior before finalizing the fix.

Thank you!

> ColognePhonetic: Duplicate code in some cases
> ---------------------------------------------
>
>                 Key: CODEC-317
>                 URL: https://issues.apache.org/jira/browse/CODEC-317
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.15, 1.16.1
>            Reporter: DRUser123
>            Priority: Major
>
> h2. ColognePhonetic: Duplicate code in some cases
> When the character "H" or an intermediate vowel (one not at the beginning of
> the string) is encountered, its code should not be added to the output.
> However, the lastCode variable still takes that code's value, so a following
> duplicate code is no longer recognized.
> The code in question is
> ColognePhonetic$CologneOutputBuffer.put(code), line 275 in version 1.16.1
> (also tested with 1.15).
> Example with Müller (correctly coded):
> Char = 'M', code = 6, lastCode = null, output = '6'
> Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are 
> added)
> Char = 'L', code = 5, lastCode = 0, output = '65'   
> Char = 'L', code = 5, lastCode = 5, output = '65' (no duplicate codes are 
> added)
> Char = 'E', code = 0, lastCode = 5, output = '65' (no intermediate zeros are 
> added)
> Char = 'R', code = 7, lastCode = 0, output = '657' 
> Example with Mülhler (incorrectly coded):
> Char = 'M', code = 6, lastCode = null, output = '6'
> Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are 
> added)
> Char = 'L', code = 5, lastCode = 0, output = '65'   
> Char = 'H', code = -, lastCode = 5, output = '65' 
> Char = 'L', code = 5, lastCode = -, output = '655' (fails to identify the
> duplicate code)
> Char = 'E', code = 0, lastCode = 5, output = '655' (no intermediate zeros are
> added)
> Char = 'R', code = 7, lastCode = 0, output = '6557' 
> Example with Müleler (incorrectly coded):
> Char = 'M', code = 6, lastCode = null, output = '6'
> Char = 'U', code = 0, lastCode = 6, output = '6' (no intermediate zeros are 
> added)
> Char = 'L', code = 5, lastCode = 0, output = '65'   
> Char = 'E', code = 0, lastCode = 5, output = '65' (no intermediate zeros are 
> added)
> Char = 'L', code = 5, lastCode = 0, output = '655' (fails to identify the
> duplicate code)
> Char = 'E', code = 0, lastCode = 5, output = '655' (no intermediate zeros are 
> added)
> Char = 'R', code = 7, lastCode = 0, output = '6557' 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
