I just started an experiment that I have been wanting to do for a long
time, and I want to share it with you.

Normally, we try to detect errors by looking at part of a sentence and
the surrounding words.

My approach for this experiment starts at the other end. It takes
sentences, and tries to find sentence-long patterns.

Because so many words have more than one meaning and POS tag, the
approach is:
- get a sentence
(discard sentences that are too long; they are not considered to be
correct anyway, and for now they make things too slow)
- make all permutations of the sentence using the words and POS tags
- count the number of occurrences of all these patterns
- analyse the patterns with the highest frequencies for correctness
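
The steps above could be sketched roughly as follows. This is only an
illustration, not the actual experiment code. It assumes that "all
permutations" means: for every position in the sentence, take either the
word itself or one of its POS tags. The names sentence_patterns,
count_patterns and MAX_TOKENS, the cutoff of 8 tokens, and the tiny
hand-tagged corpus are all made up for the example.

```python
from collections import Counter
from itertools import product

MAX_TOKENS = 8  # assumed cutoff; longer sentences are skipped


def sentence_patterns(tagged_sentence):
    """Return every pattern for one sentence.

    tagged_sentence is a list of (word, [pos_tags]) pairs. Each position
    in a pattern holds either the lowercased word or one of its POS tags.
    """
    options = [[word.lower()] + sorted(set(tags))
               for word, tags in tagged_sentence]
    return product(*options)


def count_patterns(tagged_sentences, max_tokens=MAX_TOKENS):
    """Count how often each sentence-long pattern occurs in the corpus."""
    counts = Counter()
    for sentence in tagged_sentences:
        if len(sentence) > max_tokens:
            continue  # too long: skipped, as described in the list above
        counts.update(sentence_patterns(sentence))
    return counts


# Hypothetical two-sentence corpus, tagged by hand for the example.
corpus = [
    [("The", ["DT"]), ("dog", ["NN"]), ("barks", ["VBZ", "NNS"])],
    [("The", ["DT"]), ("cat", ["NN"]), ("sleeps", ["VBZ"])],
]
counts = count_patterns(corpus)

# The most frequent patterns are the candidates to review by hand.
for pattern, freq in counts.most_common(2):
    print(freq, " ".join(pattern))
```

The number of patterns per sentence is the product of the number of
choices at each position, so it grows exponentially with sentence
length. That is why a length cutoff matters for a brute-force run.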

And then:
- deliberately introduce probable errors into those patterns to create
(full sentence) error detecting patterns.
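
One way this last step might look, as a sketch only: take a pattern
that was judged correct and swap one token for a word it is often
confused with. The CONFUSIONS table and the function name
error_patterns are hypothetical; which errors count as "probable" is
not decided yet.

```python
# Hypothetical table of likely confusions: correct word -> wrong words.
CONFUSIONS = {
    "their": ["there"],
    "then": ["than"],
}


def error_patterns(pattern, confusions=CONFUSIONS):
    """Return variants of a correct pattern with exactly one token
    replaced by a word it is likely to be confused with."""
    variants = []
    for i, token in enumerate(pattern):
        for wrong in confusions.get(token, []):
            variants.append(pattern[:i] + (wrong,) + pattern[i + 1:])
    return variants


correct = ("PRP", "VBD", "their", "NN")
for bad in error_patterns(correct):
    print(" ".join(bad), "=> suggest:", " ".join(correct))
```

Each generated pattern keeps a link to the correct pattern it came
from, so a match on the error pattern also gives the suggestion.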

It is an experiment; it might not have good results at all, but I am
hopeful. It is a brute-force approach, so it takes some computer time.

I will report the results once I have them.

Ruud


_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
