On 2014-01-04 02:16, Jaume Ortolà i Font wrote:
> 2014/1/3 Marcin Miłkowski <list-addr...@wp.pl>
>
> > Yes, you are right. I did some profiling and indeed, our main problem is
> > that findRepl on line 306 in Speller.java is run zillions of times.
> >
> > One easy but brutal fix would be to run findRepl only on the original
> > word and use replacement pairs only to generate complete candidates.
> > This would save a lot of time but on the pain of quality. Maybe it would
> > be better before we have a proper traversal routine?
> >
> > The only way to find out whether the quality of suggestions really drops
> > is to run it on common misspellings with and without findRepl on
> > wordsToCheck list. Could you make this experiment on Catalan data?
>
> In that case there will be no suggestions for words with two errors: a
> replacement pair + another error. That was the purpose.
>
> I think we can choose a middle ground. It makes no sense to check 6.000
> word candidates (none of which exists in the dictionary). But it makes
> sense to check three, four or a dozen words. With a limit of 4, all
> tests I can imagine in Catalan are passed. So I suggest a number of 10
> or 15.

I did, and after replacing the morfologik speller library in 2.4 with a new binary that includes the upper limit, there is no longer a problem with EN_US: we're very fast again.

Maybe this is the proper fix, then? I can ask Dawid to release this small bugfix version, which would give us a sure fix for all languages until we find a proper traversal algorithm.

Best,
Marcin
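
For illustration only, the "upper limit" idea from this thread could be sketched roughly as below. This is not the code in Speller.java or in the morfologik binary: the class name CappedReplacements, the method expand, the constant MAX_CANDIDATES and the shape of the replacement-pair map are all assumptions made for this example. The sketch applies each replacement pair once at each position where it matches and stops as soon as the cap is reached, so the expensive dictionary lookup never sees thousands of candidates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of capped candidate generation; not the morfologik code. */
final class CappedReplacements {

    /** Assumed cap, taken from the 10-15 range suggested in the thread. */
    static final int MAX_CANDIDATES = 15;

    /**
     * Returns the original word followed by variants obtained by applying one
     * replacement pair at one position. Generation stops once the returned
     * list holds {@code limit} entries (the original word counts as one).
     */
    static List<String> expand(String word, Map<String, List<String>> pairs, int limit) {
        List<String> out = new ArrayList<>();
        out.add(word);
        for (Map.Entry<String, List<String>> entry : pairs.entrySet()) {
            String from = entry.getKey();
            if (from.isEmpty()) {
                continue; // an empty key would match at every position forever
            }
            // Visit every occurrence of the key, including overlapping ones.
            for (int i = word.indexOf(from); i >= 0; i = word.indexOf(from, i + 1)) {
                for (String to : entry.getValue()) {
                    if (out.size() >= limit) {
                        return out; // cap reached: stop generating candidates
                    }
                    String candidate =
                        word.substring(0, i) + to + word.substring(i + from.length());
                    if (!out.contains(candidate)) {
                        out.add(candidate);
                    }
                }
            }
        }
        return out;
    }
}
```

A caller would then run the dictionary check only on the list returned by CappedReplacements.expand(word, pairs, CappedReplacements.MAX_CANDIDATES). Whether the cap should count the original word, and whether it belongs at generation time or at lookup time, are design choices this sketch does not settle.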