[ https://issues.apache.org/jira/browse/CODEC-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570928#comment-16570928 ]
Gary Gregory commented on CODEC-248:
------------------------------------
Can you please paste the email exchange here for the record?
> language.DaitchMokotoffSoundex gives overly broad results for tokens
> containing RS
> ----------------------------------------------------------------------------------
>
> Key: CODEC-248
> URL: https://issues.apache.org/jira/browse/CODEC-248
> Project: Commons Codec
> Issue Type: Bug
> Reporter: Ben Kazez
> Priority: Minor
>
> I am using Apache Commons Codec in Elasticsearch (via Lucene).
> # GIERSZLIK codes to 548500 or 594850
> # GOTSALK codes to 548500
> # These names don't sound alike, but the matching codes mean that a search
> for one returns the other.
> Solution: I exchanged emails with Gary Mokotoff, co-creator of the algorithm,
> who said:
> {quote}I would drop RS from the table. ... I cannot think of any language
> where RS is pronounced "S" (4).{quote}
>
>
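To illustrate why the shared code matters for search, here is a minimal Java sketch that checks whether two sets of codes overlap. It uses only the code values reported in the description above and does not run the encoder itself. The class and method names are hypothetical, and the '|'-separated input format is an assumption modelled on how DaitchMokotoffSoundex.soundex(String) is understood to return branched codes.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Commons Codec.
final class SoundexCodeOverlap {

    /** Splits a '|'-separated list of codes into a set (assumed input format). */
    public static Set<String> parseCodes(String branched) {
        return new LinkedHashSet<>(Arrays.asList(branched.split("\\|")));
    }

    /** Returns the codes that both inputs have in common. */
    public static Set<String> sharedCodes(String first, String second) {
        Set<String> shared = parseCodes(first);
        shared.retainAll(parseCodes(second));
        return shared;
    }

    public static void main(String[] args) {
        // Code values copied from the issue description, not computed here.
        String gierszlik = "548500|594850";
        String gotsalk = "548500";

        // Any non-empty intersection means a search for one name matches the other.
        System.out.println(sharedCodes(gierszlik, gotsalk)); // prints [548500]
    }
}
```

With the reported values, the intersection is the single code 548500, which is the one attributed to the RS rule in the description.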