I don't understand most of the technicalities, but I can state that for Dutch, several single-character and multi-character substitutions in the same word are often necessary to get the proper word.
Example: "cocoskoekie" is a very informal and wrong spelling of "kokoskoekje" (the common c-k interchange twice, the common kie-kje exchange once).

Ruud

> As predicted, the code I wrote for multiple character substitutions had
> several bugs. I solved them (see the attachment), but more problems could
> arise with other languages or other substitutions.
>
> Here I would like to talk about another approach for generating spelling
> suggestions: just checking the words with substitutions directly. Several
> steps could be done, but each step is taken only if no suggestions have
> been found in the previous one. These could be the steps:
>
> 1) Make a tree search.
> 2) Prepare words with substitutions. Are they misspelled words?
> 3) Make a new tree search of words with substitutions.
>
> Note that step 2) is very low cost, and step 3) is high cost. Step 2)
> could even be the first step.
>
> Would this approach be more or less efficient? It depends on the kind and
> the number of errors we find in the texts. When the only errors are one or
> more multiple character substitutions, it will be faster. When there is
> one multiple character substitution plus another kind of error, it will
> be slower. So the only way to decide is to try both and see which is
> better statistically.
>
> Note that using multiple character substitution inside the tree search
> algorithm is not as costly as repeating the tree search, but it is
> something in between.
>
> Best regards,
> Jaume
> _______________________________________________
> Languagetool-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
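
The stepwise approach from the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the attachment or from LanguageTool: the function names, the toy dictionary, and the substitution list are all hypothetical, and the "tree search" is replaced by a plain edit-distance scan over the dictionary so that the example stays self-contained.

```python
# Hypothetical substitution pairs (wrong -> right) taken from the Dutch example.
SUBSTITUTIONS = [("c", "k"), ("kie", "kje")]


def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def tree_search(word, dictionary, max_distance=1):
    """Stand-in for the real tree search: dictionary words within max_distance."""
    return sorted(w for w in dictionary if edit_distance(word, w) <= max_distance)


def substitution_variants(word, substitutions, max_subs=3):
    """Words reachable by applying up to max_subs substitutions (patterns must be non-empty)."""
    seen = {word}
    frontier = {word}
    for _ in range(max_subs):
        next_frontier = set()
        for current in frontier:
            for old, new in substitutions:
                start = current.find(old)
                while start != -1:
                    candidate = current[:start] + new + current[start + len(old):]
                    if candidate not in seen:
                        seen.add(candidate)
                        next_frontier.add(candidate)
                    start = current.find(old, start + 1)
        frontier = next_frontier
    seen.discard(word)
    return seen


def suggest(word, dictionary, substitutions):
    """Run the three steps in order; a step runs only if the previous one found nothing."""
    # Step 1: ordinary search on the misspelled word.
    found = tree_search(word, dictionary)
    if found:
        return found
    # Step 2 (cheap): are any of the substituted words already correct?
    variants = substitution_variants(word, substitutions)
    found = sorted(v for v in variants if v in dictionary)
    if found:
        return found
    # Step 3 (expensive): repeat the search for every substituted word.
    found = set()
    for variant in variants:
        found.update(tree_search(variant, dictionary))
    return sorted(found)


dictionary = {"kokoskoekje", "koekje", "kokosnoot"}
print(suggest("cocoskoekie", dictionary, SUBSTITUTIONS))  # ['kokoskoekje']
```

In this sketch "cocoskoekie" is resolved in step 2, because it needs three substitutions and nothing else. A word such as "cocoskoekies", which has an extra error on top of the substitutions, only gets a suggestion in step 3, which is the slower case described in the quoted message.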
