[
https://issues.apache.org/jira/browse/SOLR-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812912#comment-13812912
]
Thomas Champagne commented on SOLR-2982:
----------------------------------------
I also noticed the poor performance of the Beider-Morse encoder, so I created
issue CODEC-174 in the commons-codec project to improve it.
So far, I have submitted two patches that together cut the encoding time in
half.
If you want a better Beider-Morse encoder, you can join us on issue
CODEC-174 :)
> Upgrade Apache Commons Codec to version 1.6 in order to add new Beider-Morse
> Phonetic Matching (BMPM) option
> ------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-2982
> URL: https://issues.apache.org/jira/browse/SOLR-2982
> Project: Solr
> Issue Type: Improvement
> Components: Rules, Schema and Analysis, search
> Reporter: Brooke Schreier Ganz
> Labels: codec, commons, commons-codec, language, names,
> phonetic, search, searching, soundalike
> Fix For: 3.6, 4.0-ALPHA
>
> Attachments: SOLR-2982.patch
>
>
> Apache Commons Codec released version 1.6 of their codec pack in November
> 2011. Along with a few bug fixes, 1.6 contains a great new phonetic matching
> system called Beider-Morse Phonetic Matching (BMPM) that is far superior to
> the existing phonetic codecs, such as regular soundex, metaphone, caverphone,
> and so on. BMPM has actually been available for some time, but this is the
> first port of it to Java, and its first commit in the Apache ecosystem.
> For a lot more information, see here: http://stevemorse.org/phoneticinfo.htm
> and http://stevemorse.org/phonetics/bmpm.htm
> BMPM would be a fantastic "soundalike" tool to help search for personal names
> (or just surnames) in a Solr/Lucene index, much better than Levenshtein
> distance for this use case.