Hi,

I'm testing some minor improvements to the Morfologik speller.

They are here:
https://github.com/jaumeortola/morfologik-stemming

The most important ones are:
- Try all possible replacements at the same point of a word (not only the
longest one). [1]
- Apply the properties "ignore-diacritics" and "convert-case" when
searching for replacements. [2]
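
To illustrate the first point, here is a minimal sketch (not the actual
Morfologik code, which works while traversing the automaton). The class
ReplacementSketch and the method candidatesAt are hypothetical names used
only for this example. It shows that with the pairs "r rh, rh r" both keys
match at the start of a word, so both candidates are produced instead of
only the one for the longest key:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Morfologik.
final class ReplacementSketch {

    /**
     * Returns every string obtained by applying one replacement pair at
     * position pos of word. Every key that matches at pos is tried,
     * not only the longest one.
     */
    static List<String> candidatesAt(String word, int pos,
                                     Map<String, List<String>> pairs) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : pairs.entrySet()) {
            String key = e.getKey();
            if (word.startsWith(key, pos)) {
                for (String rep : e.getValue()) {
                    out.add(word.substring(0, pos) + rep
                            + word.substring(pos + key.length()));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Lower-case replacement pairs: r -> rh, rh -> r
        Map<String, List<String>> pairs = new LinkedHashMap<>();
        pairs.put("r", List.of("rh"));
        pairs.put("rh", List.of("r"));
        System.out.println(candidatesAt("rhein", 0, pairs));
    }
}
```

In the real speller each candidate would still have to be checked against
the dictionary; the sketch only generates the strings.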

Some of the current German LanguageTool tests fail. Could you take a
look, Daniel? You need to define the replacements in lower case (r rh,
rh r). Are the results reasonable?

There is a problem with the test "Batallion > Bataillon". Its cause
predates my changes.

In the areEqual() method of the Morfologik Speller, the condition "if ...
isConvertingCase()" (true in German by default, as it is not defined in
de_DE.info) is nested inside "if ... isIgnoringDiacritics()" (false in
German). [3] So it seems that German behaves as if convert-case=false,
not by design but because of a programming error.
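
A simplified sketch of the nesting issue follows. It is not the code from
Speller.java; CharEqualitySketch and its methods are hypothetical names,
and the two flags are passed as plain booleans instead of being read from
the dictionary metadata. The nested variant never reaches the case check
when ignore-diacritics is false, while the independent variant evaluates
the two properties separately:

```java
import java.text.Normalizer;

// Hypothetical illustration only; not the Morfologik implementation.
final class CharEqualitySketch {

    /** Base character after canonical decomposition (e.g. e-acute -> e). */
    static char stripDiacritics(char c) {
        return Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD)
                .charAt(0);
    }

    /** Nested shape: convertCase only matters if ignoreDiacritics is true. */
    static boolean nested(char x, char y,
                          boolean ignoreDiacritics, boolean convertCase) {
        if (x == y) {
            return true;
        }
        if (ignoreDiacritics) {
            char xn = stripDiacritics(x);
            char yn = stripDiacritics(y);
            if (xn == yn) {
                return true;
            }
            if (convertCase) {
                return Character.toLowerCase(xn) == Character.toLowerCase(yn);
            }
        }
        return false;
    }

    /** Alternative shape: each flag is applied independently. */
    static boolean independent(char x, char y,
                               boolean ignoreDiacritics, boolean convertCase) {
        if (ignoreDiacritics) {
            x = stripDiacritics(x);
            y = stripDiacritics(y);
        }
        if (convertCase) {
            x = Character.toLowerCase(x);
            y = Character.toLowerCase(y);
        }
        return x == y;
    }

    public static void main(String[] args) {
        // German-like settings: ignore-diacritics=false, convert-case=true
        System.out.println(nested('B', 'b', false, true));
        System.out.println(independent('B', 'b', false, true));
    }
}
```

With ignore-diacritics=false and convert-case=true, the nested variant
treats 'B' and 'b' as different while the independent variant treats them
as equal, which is the difference described above.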

If the preferred option in German is convert-case=false, then my changes
will not affect the German tests in any way.

Regards,
Jaume


[1]
https://github.com/jaumeortola/morfologik-stemming/commit/735f1faf82c21648039bcdb796c372ebf5bab119
[2]
https://github.com/jaumeortola/morfologik-stemming/commit/bea331608606fe774b7de20d6f73f1d3aa601e7a

[3]
https://github.com/morfologik/morfologik-stemming/blob/master/morfologik-speller/src/main/java/morfologik/speller/Speller.java#L601
------------------------------------------------------------------------------
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel