I have just started an experiment that I have wanted to do for a long time, and I want to share it with you.
Normally, we try to detect errors in part of a sentence and its surrounding words. My approach for this experiment starts at the other end: it takes whole sentences and tries to find sentence-long patterns. Because so many words have multiple meanings and POS tags, the approach is:

- get a sentence (discard sentences that are too long; they are not considered to be correct anyway, and for now they make things too slow)
- make all permutations of the sentence using the words and POS tags
- count the number of occurrences of all these patterns
- analyse the patterns with the highest frequencies for correctness

And then:

- deliberately introduce probable errors into those patterns to create (full-sentence) error-detecting patterns.

It is an experiment; it might not give good results at all, but I am hopeful. It is a brute-force approach, so it takes some computer time.

I will inform you of the results in due course.

Ruud

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
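
To make the generate-and-count steps described above more concrete, here is a minimal Python sketch. It is only an illustration under several assumptions, not the code used in the experiment: "all permutations" is read as every combination in which each position holds either the word itself or one of its POS tags; the function names, the `MAX_TOKENS` cutoff, and the tag names (DT, NN, VBZ, NNS) are all hypothetical.

```python
from collections import Counter
from itertools import product

MAX_TOKENS = 8  # hypothetical cutoff; longer sentences are skipped


def sentence_patterns(tagged_sentence):
    """Return every pattern for one sentence.

    tagged_sentence is a list of (word, [postag, ...]) pairs. Each position
    in a pattern is either the lowercased word or one of its tags, written
    as <TAG> so that tags cannot be confused with words.
    """
    slots = [
        [word.lower()] + [f"<{tag}>" for tag in tags]
        for word, tags in tagged_sentence
    ]
    # Cartesian product over positions: one choice per slot.
    return [" ".join(choice) for choice in product(*slots)]


def count_patterns(corpus, max_tokens=MAX_TOKENS):
    """Count how often each sentence-long pattern occurs in the corpus."""
    counts = Counter()
    for sentence in corpus:
        if len(sentence) > max_tokens:
            continue  # discard sentences that are too long
        counts.update(sentence_patterns(sentence))
    return counts


# Tiny made-up corpus; "barks" is ambiguous between verb and plural noun.
corpus = [
    [("the", ["DT"]), ("dog", ["NN"]), ("barks", ["VBZ", "NNS"])],
    [("the", ["DT"]), ("cat", ["NN"]), ("sleeps", ["VBZ"])],
]
counts = count_patterns(corpus)

# The most frequent patterns are the candidates to check for correctness.
for pattern, n in counts.most_common(3):
    print(n, pattern)
```

The number of patterns per sentence is the product of the number of choices at each position, so it grows exponentially with sentence length. That is why a length cutoff of some kind seems necessary for a brute-force run.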