I don't understand most of the technicalities, but I can state that for
Dutch, several substitutions in one word, of both single and multiple
characters, are necessary to get the proper word.

Example: cocoskoekie is a very informal and wrong spelling of kokoskoekje
(twice the common c-k interchange, once the common kie-kje exchange).
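
To make the example concrete, here is a minimal sketch in Python of what
"several substitutions in one word" means. It is not LanguageTool code: the
names RULES and variants are hypothetical, and the rule table holds only the
two exchanges mentioned above. It walks the word from left to right and, at
each position, either keeps the character or applies a rule that matches
there.

```python
# Hypothetical rule table for illustration: (wrong spelling, replacement).
RULES = [("c", "k"), ("kie", "kje")]


def variants(word, rules):
    """Return all spellings reachable by applying the rules at any
    combination of positions, excluding the original word."""

    def expand(i):
        # Yield every possible rewriting of word[i:].
        if i == len(word):
            yield ""
            return
        # Option 1: keep the character at position i unchanged.
        for tail in expand(i + 1):
            yield word[i] + tail
        # Option 2: apply any rule whose source matches at position i.
        for src, dst in rules:
            if word.startswith(src, i):
                for tail in expand(i + len(src)):
                    yield dst + tail

    return set(expand(0)) - {word}


candidates = variants("cocoskoekie", RULES)
print("kokoskoekje" in candidates)
```

The number of variants doubles with every position where a rule matches, so
this brute-force form only suits short rule tables and short words.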

Ruud

> As predicted, the code I wrote for multiple character substitutions had
> several bugs. I solved them (see the attachment), but more problems could
> arise with other languages or other substitutions.
>
> Here I would like to talk about another approach for generating spelling
> suggestions: just checking the words with substitutions directly. Several
> steps could be done, but each step is taken only if no suggestions have
> been found in the previous one. These could be the steps:
>
> 1) Make a tree search.
> 2) Prepare words with substitutions. Are they misspelled words?
> 3) Make a new tree search of words with substitutions.
>
> Note that step 2) is very low cost, and step 3) is high cost. Step 2)
> could even be the first step.
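
The three steps in the quoted message can be sketched in Python as follows.
This is only an illustration under stated assumptions, not the attached code:
tree_search is a hypothetical stand-in that accepts dictionary words within
one edit of the input, the dictionary is a plain set, and the names RULES,
variants and suggest are made up for this sketch.

```python
# Hypothetical rule table: (wrong spelling, replacement).
RULES = [("c", "k"), ("kie", "kje")]


def variants(word, rules):
    """All spellings reachable by applying rules at any positions."""
    def expand(i):
        if i == len(word):
            yield ""
            return
        for tail in expand(i + 1):
            yield word[i] + tail
        for src, dst in rules:
            if word.startswith(src, i):
                for tail in expand(i + len(src)):
                    yield dst + tail
    return set(expand(0)) - {word}


def within_one_edit(a, b):
    """True if a and b differ by at most one insert, delete or replace."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]
    return a[i:] == b[i + 1:]


def tree_search(word, dictionary):
    """Stand-in for the real tree search (an assumption of this sketch)."""
    return sorted(w for w in dictionary if within_one_edit(word, w))


def suggest(word, dictionary, rules):
    # Step 1: tree search on the word as typed.
    found = tree_search(word, dictionary)
    if found:
        return found
    # Step 2: cheap check, is any substituted form a dictionary word?
    substituted = variants(word, rules)
    found = sorted(substituted & dictionary)
    if found:
        return found
    # Step 3: expensive, repeat the tree search for every substituted form.
    found = set()
    for candidate in substituted:
        found.update(tree_search(candidate, dictionary))
    return sorted(found)


dictionary = {"kokoskoekje", "koekje", "kokosnoot"}
print(suggest("cocoskoekie", dictionary, RULES))
```

Swapping the first two blocks of suggest gives the variant in which step 2)
runs first.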
>
> Would this approach be more or less efficient? It depends on the kind and
> the number of errors we find in the texts. When the only errors are one or
> more multiple character substitutions, it will be faster. When there is
> one multiple character substitution error plus another kind of error, it
> will be slower. So the only way to decide which is better is to try both
> and compare them statistically.
>
> Note that using multiple character substitution inside the tree search
> algorithm is not so costly as repeating the tree search, but it is
> something in between.
>
> Best regards,
> Jaume
> _______________________________________________
> Languagetool-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>