Hi, I'm testing some minor improvements in the Morfologik speller.
They are here: https://github.com/jaumeortola/morfologik-stemming

The most important are:
- Try all possible replacements at the same point of a word (not only the longest one). [1]
- Apply the properties "ignore-diacritics" and "convert-case" when searching for replacements. [2]

There are some failures with the current German LanguageTool tests. Could you take a look, Daniel? You need to use replacements in lower-case (r rh, rh r). Are the results reasonable?

There is a problem with the test "Batallion > Bataillon". The cause predates my changes. In the areequal() method of Morfologik Speller, the condition "if ... isConvertingCase()" (true in German by default, as it is not defined in de_DE.info) is inside "if ... isIgnoringDiacritics()" (false in German). [3] So the case check is never reached, and German seems to be working as if convert-case=false, not by design but because of a programming error.

If the preferred option in German is convert-case=false, then my changes will not affect the German tests in any way.

Regards,
Jaume

[1] https://github.com/jaumeortola/morfologik-stemming/commit/735f1faf82c21648039bcdb796c372ebf5bab119
[2] https://github.com/jaumeortola/morfologik-stemming/commit/bea331608606fe774b7de20d6f73f1d3aa601e7a
[3] https://github.com/morfologik/morfologik-stemming/blob/master/morfologik-speller/src/main/java/morfologik/speller/Speller.java#L601
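
To make the nesting issue in areequal() concrete, here is a minimal sketch. It is not the actual Speller.java code: the class name, method names and boolean parameters are hypothetical stand-ins for the dictionary properties, and the comparison is simplified. It only shows how a case check nested inside the diacritics check becomes unreachable when ignore-diacritics=false.

```java
import java.text.Normalizer;

// Simplified illustration only; all names here are hypothetical, not the Morfologik API.
class AreEqualSketch {

    // Base character after canonical decomposition, e.g. 'é' -> 'e'.
    static char stripDiacritics(char c) {
        return Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD).charAt(0);
    }

    // Nested variant: the case check is only reached when ignoreDiacritics is true.
    static boolean nestedEqual(char x, char y, boolean ignoreDiacritics, boolean convertCase) {
        if (x == y) {
            return true;
        }
        if (ignoreDiacritics) {
            char xn = stripDiacritics(x);
            char yn = stripDiacritics(y);
            if (xn == yn) {
                return true;
            }
            if (convertCase) {
                return Character.toLowerCase(xn) == Character.toLowerCase(yn);
            }
        }
        return false;
    }

    // Independent variant: each property is applied on its own.
    static boolean independentEqual(char x, char y, boolean ignoreDiacritics, boolean convertCase) {
        if (ignoreDiacritics) {
            x = stripDiacritics(x);
            y = stripDiacritics(y);
        }
        if (convertCase) {
            x = Character.toLowerCase(x);
            y = Character.toLowerCase(y);
        }
        return x == y;
    }

    public static void main(String[] args) {
        // German-like settings: ignore-diacritics=false, convert-case=true.
        System.out.println(nestedEqual('B', 'b', false, true));
        System.out.println(independentEqual('B', 'b', false, true));
    }
}
```

With the German-like settings (ignore-diacritics=false, convert-case=true), the nested variant treats 'B' and 'b' as different, while the independent variant treats them as equal. That is the behaviour difference I would expect to show up in the German tests.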