On 31 Mar 2006, at 06:54, karl wettin wrote:

I've been working a bit with the spell checker. It does a pretty good job when it comes to finding a simple typo. I was thinking it would be nice if I could turn "heros light and magic" into "did you mean: heroes of might and magic?".

My strategy is to combine Markov, A* and Levenshtein.

Any comments on this? Questions?

Nothing? Not even a go-go-go? I would really like to discuss it with someone before I spend too much time on it. This is what it is: a simple Markov chain is similar to n-grams, but on a word level rather than a character level. A* is a classic pathfinding algorithm from game programming that finds the cheapest path through a graph or grid. I assume you all know Levenshtein from FuzzyQuery.
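
To make the idea concrete, here is a rough sketch of how the three pieces might fit together. It is only an illustration under my own assumptions, not anything from the Lucene spell checker: the function names (levenshtein, train_bigrams, transition_cost, suggest) are made up, the cost of a path is assumed to be the sum of the edit distances plus the negative log of the word-to-word transition probabilities, and the flat penalty of 10.0 for an unseen transition is arbitrary. It also only swaps words one for one, so it cannot insert a missing word such as the "of" in the example above.

```python
import heapq
import math
from collections import defaultdict


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def train_bigrams(phrases):
    """Count word-to-word transitions; '<s>' marks the start of a phrase."""
    counts = defaultdict(lambda: defaultdict(int))
    for phrase in phrases:
        prev = "<s>"
        for word in phrase.split():
            counts[prev][word] += 1
            prev = word
    return counts


def transition_cost(counts, prev, word, unseen=10.0):
    """Negative log probability of prev -> word; flat penalty if never seen."""
    followers = counts.get(prev, {})
    seen = followers.get(word, 0)
    if seen == 0:
        return unseen  # arbitrary penalty, an assumption of this sketch
    return -math.log(seen / sum(followers.values()))


def suggest(query, counts, max_edits=2):
    """A* over (position, chosen word): cheapest sequence of known words."""
    words = query.split()
    vocab = {w for followers in counts.values() for w in followers}
    # Candidates per position: known words within max_edits of the typed word.
    # If nothing is close enough, keep the typed word unchanged.
    cands = []
    for w in words:
        near = [(levenshtein(w, v), v) for v in vocab]
        near = [(d, v) for d, v in near if d <= max_edits] or [(0, w)]
        cands.append(near)
    # Heuristic: cheapest possible edit cost of the remaining positions.
    # Transition costs are never negative, so this never overestimates.
    rest = [0] * (len(words) + 1)
    for i in range(len(words) - 1, -1, -1):
        rest[i] = rest[i + 1] + min(d for d, _ in cands[i])
    # Heap entries: (g + h, g, position, path so far)
    heap = [(rest[0], 0.0, 0, ("<s>",))]
    while heap:
        _, g, pos, path = heapq.heappop(heap)
        if pos == len(words):
            return " ".join(path[1:])
        for d, v in cands[pos]:
            g2 = g + d + transition_cost(counts, path[-1], v)
            heapq.heappush(heap, (g2 + rest[pos + 1], g2, pos + 1, path + (v,)))
    return query


corpus = ["heroes of might and magic", "lord of the rings",
          "might and magic", "night and day"]
model = train_bigrams(corpus)
print("did you mean:", suggest("heros of light and magic", model))
```

In this toy corpus both "might" and "night" are one edit away from "light", and it is the Markov transition from "of" that tips the choice. The search keeps whole paths on the heap rather than a closed set, which is fine for short queries but would need pruning for anything larger.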

I have been sleeping on this a bit and think it might not work on a big corpus. One probably has to limit it to one Markov chain per context of some kind, say per category.

Perhaps there is some other forum more focused on text analysis that you would recommend?
