On 2013-12-31 15:04, Marcin Miłkowski wrote:
> Hi,
>
> On 2013-12-31 13:20, Jaume Ortolà i Font wrote:
>> Hi,
>>
>> In the current implementation the number of possible suggestions grows
>> exponentially with the replacement pairs, which is not a good thing...
>> For "Milkowski" you get 6144 possible suggestions in American English. I
>> fixed a limit of 7 possible simultaneous replacements in a word, which
>> (if the replacements are one to two) gives you 2^7=128 possible
>> suggestions. But in the case of "Milkowski" in American English, almost
>> all letters have 3 or 4 possible replacements (which I didn't foresee), so
>> the limit is about 4^7=16384.
> I'd get rid of these multiple replacements. They are spurious anyway, I
> guess. Some of them seem to be repeated (I just copied these from
> hunspell). I will look at them again.

I did look at them again, and it turns out that most of them only slowed down the process without helping to create good suggestions, because short fragments were not put together the way they should be. So I went ahead and removed a fair number of the replacement pairs, but we still need to find ways to limit the search when there are too many replacement pairs.
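As a rough illustration of the growth Jaume describes (this is not the actual LanguageTool or morfologik code), the Python sketch below counts and generates candidates when each letter can either stay as it is or take one of its alternatives. The replacement table and the names `count_candidates`, `candidates` and `max_replaced` are made up for this example; only the limit of 7 and the 2^7 and 4^7 figures come from the thread.

```python
from math import prod

# Hypothetical replacement table: letter -> alternatives tried for it.
# These pairs are invented for illustration, not taken from any real dictionary.
REPLACEMENTS = {
    "i": ["y", "ee"],
    "k": ["c", "ck"],
    "o": ["u"],
    "w": ["v"],
    "s": ["z"],
}


def count_candidates(word, replacements):
    """Count candidates when every letter may stay or take any alternative."""
    return prod(1 + len(replacements.get(ch, [])) for ch in word)


def candidates(word, replacements, max_replaced):
    """Yield candidates with at most max_replaced simultaneous replacements."""

    def extend(pos, prefix, used):
        if pos == len(word):
            yield prefix
            return
        ch = word[pos]
        # Option 1: keep the original letter.
        yield from extend(pos + 1, prefix + ch, used)
        # Option 2: replace it, but only while the budget allows.
        if used < max_replaced:
            for alt in replacements.get(ch, []):
                yield from extend(pos + 1, prefix + alt, used + 1)

    yield from extend(0, "", 0)


if __name__ == "__main__":
    word = "milkowski"
    print(count_candidates(word, REPLACEMENTS))
    print(len(list(candidates(word, REPLACEMENTS, max_replaced=2))))
```

With one alternative per letter, a 7-letter word gives 2^7 = 128 candidates, and with three alternatives per letter it gives 4^7 = 16384. So a cap on simultaneous replacements only helps when it is small compared with the number of replaceable letters; trimming the replacement table itself shrinks the base of the exponent.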

I don't think traversing the automaton directly would be faster because we stop searching the automaton early anyway (as soon as the letter is not found). I'm afraid the truth is that we have to be very careful when adding replacement pairs, as completely ignoring one-letter replacements is not a good idea (at least not for Polish, in which people confuse "h" and "ch").

Best,
Marcin

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel