[ https://issues.apache.org/jira/browse/LUCENENET-466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235915#comment-13235915 ]
Christopher Currens commented on LUCENENET-466:
-----------------------------------------------
DIN-5007-1 and DIN-5007-2 are both valid ways of sorting, so both should
probably be available as options. DIN-5007-1 is used for words, and it is
what the GermanStemmer class currently implements. DIN-5007-2 is a special
sorting for lists of names (phone book sorting), so I can see where having
both would be beneficial. Since I don't want to diverge from the Java stemmer
too much, I think this should just be an additional constructor on the
GermanAnalyzer class that takes a bool indicating whether to use DIN-5007-2.
For reference:
||Letter||DIN-5007-1||DIN-5007-2||
|ä|a|ae|
|ö|o|oe|
|ü|u|ue|
|ß|ss|ss|
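
To make the difference between the two mappings concrete, here is a minimal sketch of how the choice could be driven by a single bool. The class name UmlautFolding, the method Fold and the parameter useDin5007Part2 are hypothetical names for illustration only; they are not part of Lucene.Net or the Java stemmer. The sketch also assumes the term is already lower-cased.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not an existing Lucene.Net type.
public static class UmlautFolding
{
    // Folds lower-case German umlauts and ß according to the table above.
    // useDin5007Part2 == false : DIN-5007-1 (ä -> a,  ö -> o,  ü -> u)
    // useDin5007Part2 == true  : DIN-5007-2 (ä -> ae, ö -> oe, ü -> ue)
    // ß becomes "ss" in both variants.
    // Assumption: the input is already lower-cased, so Ä, Ö and Ü are not handled.
    public static string Fold(string term, bool useDin5007Part2)
    {
        if (term == null) throw new ArgumentNullException("term");

        // Build a new string instead of inserting into the buffer being scanned,
        // so the loop index never has to be adjusted after an expansion.
        var buffer = new StringBuilder(term.Length + 4);
        foreach (char ch in term)
        {
            switch (ch)
            {
                case 'ä': buffer.Append(useDin5007Part2 ? "ae" : "a"); break;
                case 'ö': buffer.Append(useDin5007Part2 ? "oe" : "o"); break;
                case 'ü': buffer.Append(useDin5007Part2 ? "ue" : "u"); break;
                case 'ß': buffer.Append("ss"); break;
                default:  buffer.Append(ch); break;
            }
        }
        return buffer.ToString();
    }
}
```

With this sketch, UmlautFolding.Fold("björn", false) returns "bjorn" and UmlautFolding.Fold("björn", true) returns "bjoern". An additional GermanAnalyzer constructor could pass such a flag down to the stemmer, but how that is wired up is left open here.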
> optimisation for the GermanStemmer.vb
> --------------------------------------
>
> Key: LUCENENET-466
> URL: https://issues.apache.org/jira/browse/LUCENENET-466
> Project: Lucene.Net
> Issue Type: Improvement
> Components: Lucene.Net Contrib
> Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g, Lucene.Net 3.0.3
> Reporter: Prescott Nasser
> Priority: Minor
> Fix For: Lucene.Net 3.0.3
>
>
> I have a small optimisation for the GermanStemmer.vb class (in
> Contrib.Analyzers). At the moment the function "Substitute" converts the
> German umlauts "ä" to "a", "ö" to "o" and "ü" to "u". This is not the
> correct German transliteration: they must be converted to "ae", "oe" and
> "ue". So I can write the name "Björn" as "Bjoern" but not as "Bjorn". With
> this optimisation a user can search for "Björn" and also find "Bjoern".
>
> Here is the optimized code snippet:
>
> else if ( buffer[c] == 'ä' )
> {
>     buffer[c] = 'a';
>     buffer.Insert(c + 1, 'e');
> }
> else if ( buffer[c] == 'ö' )
> {
>     buffer[c] = 'o';
>     buffer.Insert(c + 1, 'e');
> }
> else if ( buffer[c] == 'ü' )
> {
>     buffer[c] = 'u';
>     buffer.Insert(c + 1, 'e');
> }
>
> Thank You
> Björn