This is the modified Speller.java. The idea is more or less the same as the one in Jan Daciuk's code. More testing in different languages is needed, because there are many details to consider and the code may well be buggy.
When a possible multi-character substitution is found, a new branch is started by calling findRepl() with two parameters (correctionWord, correctionCandidate). They indicate the correction to be applied to the original word or to the candidate, so that everything else in the algorithm works as usual. When there is no substitution, correctionWord = correctionCandidate = 0 and everything works as before.

The only substitutions coded so far are L -> L·L (correctionCandidate=2) and L·L -> L (correctionWord=2), for Catalan. I suppose that these substitutions should go in the .info file of each language dictionary.

There is one problem still to be solved: the L -> L·L substitution adds a distance of 0, but the L·L -> L substitution adds 1. It should always be 0.

Regards,
Jaume
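
P.S. To make the intended behaviour concrete, here is a minimal, self-contained sketch of an edit distance in which a multi-character replacement pair costs 0 in both directions. It is not the code from the attached Speller.java: it uses a plain dynamic-programming table instead of the automaton traversal with findRepl(), and the class name, the PAIRS table and the helper relax() are all hypothetical names made up for this illustration. The pair is written in lowercase and is matched case-sensitively, which is also just an assumption.

```java
import java.util.Arrays;

// Illustrative sketch only; none of these names come from Speller.java.
class ReplacementDistanceSketch {

    // Hypothetical replacement table: each pair is treated as equivalent
    // (cost 0) in both directions. "\u00B7" is the middle dot in "l·l".
    static final String[][] PAIRS = { { "l", "l\u00B7l" } };

    // Levenshtein distance extended with zero-cost multi-character replacements.
    static int distance(String word, String candidate) {
        int n = word.length();
        int m = candidate.length();
        int[][] d = new int[n + 1][m + 1];
        for (int[] row : d) {
            Arrays.fill(row, Integer.MAX_VALUE / 2); // "infinity" without overflow
        }
        d[0][0] = 0;
        // Forward relaxation: d[i][j] is final by the time cell (i, j) is visited.
        for (int i = 0; i <= n; i++) {
            for (int j = 0; j <= m; j++) {
                int cur = d[i][j];
                if (i < n) {
                    d[i + 1][j] = Math.min(d[i + 1][j], cur + 1); // skip a char of word
                }
                if (j < m) {
                    d[i][j + 1] = Math.min(d[i][j + 1], cur + 1); // skip a char of candidate
                }
                if (i < n && j < m) {
                    int cost = word.charAt(i) == candidate.charAt(j) ? 0 : 1;
                    d[i + 1][j + 1] = Math.min(d[i + 1][j + 1], cur + cost); // match or substitute
                }
                for (String[] p : PAIRS) {
                    relax(d, word, candidate, i, j, p[0], p[1]); // e.g. l in word, l·l in candidate
                    relax(d, word, candidate, i, j, p[1], p[0]); // e.g. l·l in word, l in candidate
                }
            }
        }
        return d[n][m];
    }

    // If word has inWord at i and candidate has inCandidate at j,
    // jump over both substrings at no extra cost.
    static void relax(int[][] d, String word, String candidate,
                      int i, int j, String inWord, String inCandidate) {
        if (word.startsWith(inWord, i) && candidate.startsWith(inCandidate, j)) {
            int ni = i + inWord.length();
            int nj = j + inCandidate.length();
            d[ni][nj] = Math.min(d[ni][nj], d[i][j]);
        }
    }

    public static void main(String[] args) {
        System.out.println(distance("colegi", "col\u00B7legi"));
        System.out.println(distance("col\u00B7legi", "colegi"));
    }
}
```

In this sketch both calls in main() give a distance of 0, because the same relax() step is applied in both directions. In the real code the equivalent would be making sure the L·L -> L branch does not add 1 to the distance.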
Attachment: Speller.java
