On 2014-01-08 11:34, Daniel Naber wrote:
> Hi,
>
> I think these changes in matching might be caused by the recent changes
> to the tokenizer?
>
> https://languagetool.org/regression-tests/20140107/result_de_20140107.html
> https://languagetool.org/regression-tests/20140107/result_en_20140107.html
>
> Are these changes intended? At least for German it looks like the rule
> should still match (not that I find that rule very useful).

Actually, this is because I made a change in the class AbstractCompoundRule: it now returns true in isSpellingRule(), which should be the case given its LocQualityIssueType. This makes red underlines appear in the GUI (which I also made more consistent recently). However, I think the wiki testing filters out all spelling rules, so no compound rule is ever tested, which is probably wrong.

Speaking of tokenization, I'd like to add more tokenizing characters to the generic word tokenizer. I added '*', '=' and '#' to the English word tokenizer because it helps with spell-checking; these characters are not parts of standard words. But tests in German fail when I add '*' and '=', and the failure is related to one of the whitespace rules. Could you look at it? I think these characters should not be treated as parts of words.

Also, '#' makes the Breton tokenizer fail because of the hack used there to tokenize words. That hack should finally be replaced with something better, probably in the style of the English tokenizer. For the time being, we could use some other special codes there, for example special control characters.

Best,
Marcin
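
P.S. To make the tokenizer idea concrete, here is a minimal sketch of what treating '*', '=' and '#' as tokenizing characters means. This is not the actual LanguageTool code: the class name TokenizerSketch, the method tokenize() and the exact delimiter set are assumptions made up for illustration. It only shows that, once these characters are delimiters, each of them becomes a token of its own instead of sticking to the neighbouring word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical sketch, not the real LanguageTool word tokenizer.
class TokenizerSketch {

    // Assumed delimiter set: whitespace and common punctuation,
    // plus the three characters discussed above: '*', '=' and '#'.
    private static final String DELIMITERS = " \t\n\r.,;:!?()[]\"*=#";

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        // returnDelims = true: every delimiter is returned as its own
        // one-character token, so nothing from the input is dropped.
        StringTokenizer st = new StringTokenizer(text, DELIMITERS, true);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "*bold*" is split into "*", "bold", "*", so a spell checker
        // only ever sees the plain word "bold".
        System.out.println(tokenize("a=b *bold* #tag"));
    }
}
```

With a delimiter set like this, any rule that expects '*' or '=' to be glued to a word (as the German whitespace rule seems to) will see different tokens, which would explain the failing tests.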