On 2014-01-04 02:16, Jaume Ortolà i Font wrote:
> 2014/1/3 Marcin Miłkowski <list-addr...@wp.pl>
>
>     Yes, you are right. I did some profiling and indeed, our main problem is
>     that findRepl on line 306 in Speller.java is run zillions of times.
>
>     One easy but brutal fix would be to run findRepl only on the original
>     word and use replacement pairs only to generate complete candidates.
>     This would save a lot of time, but at the cost of quality. Maybe that
>     would be acceptable until we have a proper traversal routine?
>
>     The only way to find out whether the quality of suggestions really drops
>     is to run it on common misspellings with and without findRepl on the
>     wordsToCheck list. Could you run this experiment on Catalan data?
>
>
> In that case there will be no suggestions for words with two errors: a
> replacement pair + another error. That was the purpose.
>
> I think we can choose a middle ground. It makes no sense to check 6,000
> word candidates (none of which exists in the dictionary). But it makes
> sense to check three, four or a dozen words. With a limit of 4, all the
> tests I can imagine in Catalan pass. So I suggest a limit of 10 or 15.

I did that: after replacing the morfologik speller library in 2.4 with a
new binary that has the upper limit, the problem with EN_US is gone and
we're very fast again.

Maybe this is the proper fix, then? I can ask Dawid to release this as a
small bugfix version; that way we would have a reliable fix for all
languages until we find a proper traversal algorithm.

Best,
Marcin
