[ 
https://issues.apache.org/jira/browse/CODEC-250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629965#comment-16629965
 ] 

Sebb commented on CODEC-250:
----------------------------

I agree that the code could/should be re-written.
Does it makes sense to to try to do the encoding in a single pass?
It makes the code harder to follow and maintain, as evidenced by this bug 
report.

It could certainly do with some more comments as to the meaning of some of the 
variables and the special values they may take.

> Wrong value calculated by Cologne Phonetic if a special character is placed 
> between equal letters
> -------------------------------------------------------------------------------------------------
>
>                 Key: CODEC-250
>                 URL: https://issues.apache.org/jira/browse/CODEC-250
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.5, 1.11
>            Reporter: Alex Volodko
>            Priority: Major
>
> The algorith for cologne phonetic is (simpilied):
>  # Encode letter by letter from left to right according to the conversion 
> table.
>  # Remove all digits occurring more than once next to each other.
>  # Remove all code "0" except at the beginning.
> Characters which are not specified in conversion table (such as hyphens) are 
> ignored. See https://en.wikipedia.org/wiki/Cologne_phonetics
> If the input is "test-test" the step results will be:
>  # 20822082
>  # 2082082
>  # 28282
> The expected result for "test-test" is therefor 28282.
> The actual result for "test-test" is 282{color:#FF0000}2{color}82.
> This bug is caused by the fix from
> [https://github.com/apache/commons-codec/commit/72c8759a22c6552a2dfcdf61b29729f981752879]
> and is present since 1.5



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to