I've been working a bit with the spell checker. It does a pretty good job when it comes to finding a smiple typo. I was thinking it would be nice if I could turn "heros light and magic" to "did you mean: heroes of might and magic?".

My strategy is to combine a Markov chain, A* search and Levenshtein distance.

Algorithm:

First I have to train the Markov chain with the token offsets from Lucene.
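As a rough sketch of the training step: the code below assumes the tokens of each field have already been read out in position order as plain Python lists. It does not use the Lucene API, and `train_chain` and the sample data are made-up names for illustration only. It counts which token directly follows which and turns the counts into transition probabilities.

```python
from collections import defaultdict


def train_chain(documents):
    """Build a first-order Markov chain from token sequences.

    documents: iterable of token lists, each in position order
    (an assumption standing in for the token offsets of an index).
    Returns {token: {next_token: probability}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for tokens in documents:
        # Each adjacent pair (prev, nxt) is one observed transition.
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1

    # Normalise the raw counts into transition probabilities.
    chain = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        chain[prev] = {tok: n / total for tok, n in followers.items()}
    return chain


# Hypothetical sample data.
docs = [
    ["heroes", "of", "might", "and", "magic"],
    ["heroes", "of", "might", "and", "magic"],
    ["might", "and", "delight"],
]
chain = train_chain(docs)
```

With this sample data "heroes" is always followed by "of", while "and" is followed by "magic" twice as often as by "delight".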

At query time I choose the cheapest A* path through the Markov chain, keeping the Levenshtein distance to the query as short as possible.
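The per-token cost I have in mind is plain Levenshtein distance. A minimal sketch of the standard dynamic-programming version, with a function name of my own choosing, could look like this:

```python
def levenshtein(a, b):
    """Number of single-character inserts, deletes and substitutions
    needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]
```

Both typos in the example query are one edit away from the intended word: "heros" to "heroes" is one insert, and "light" to "might" is one substitution.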

I chose A* over breadth-first search so that stop words can have zero cost, and to leave room for contextual boosting in the future.
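To make the idea concrete, here is a small sketch of how the search might work. It rests on several assumptions of mine: the chain is simplified to a dict of follower sets with no probabilities, the only cost is the edit distance of each matched token, a stop word in the chain may be walked over for free without consuming a query token, and the heuristic is the sum of each remaining query token's distance to its closest vocabulary token. `STOP_WORDS`, `suggest` and the sample chain are hypothetical names and data.

```python
import heapq

STOP_WORDS = {"of", "and", "the"}  # hypothetical stop word list


def levenshtein(a, b):
    """Standard edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def suggest(query, chain):
    """A* over the chain. Returns (suggested tokens, total edit cost)."""
    vocab = set(chain) | {t for nxt in chain.values() for t in nxt}
    # Heuristic: every remaining query token costs at least its distance
    # to the closest vocabulary token, so this never overestimates.
    best = [min(levenshtein(q, t) for t in vocab) for q in query]
    remaining = [sum(best[i:]) for i in range(len(query) + 1)]

    # Heap entries: (f = g + h, g, query tokens consumed, path so far).
    frontier = [(remaining[0], 0, 0, ())]
    seen = set()
    while frontier:
        f, g, i, path = heapq.heappop(frontier)
        if i == len(query):
            return list(path), g
        state = (i, path[-1] if path else None)
        if state in seen:
            continue
        seen.add(state)
        # The first token may be any vocabulary token; after that only
        # transitions observed in the chain are allowed.
        successors = chain.get(path[-1], ()) if path else vocab
        for tok in successors:
            # Match the next query token against tok, paying edit distance.
            g2 = g + levenshtein(query[i], tok)
            heapq.heappush(
                frontier, (g2 + remaining[i + 1], g2, i + 1, path + (tok,)))
            # A stop word can also be traversed at zero cost.
            if tok in STOP_WORDS:
                heapq.heappush(
                    frontier, (g + remaining[i], g, i, path + (tok,)))
    return None


# Hypothetical chain, as it might look after training.
chain = {
    "heroes": {"of"},
    "of": {"might"},
    "might": {"and"},
    "and": {"magic", "delight"},
}
result = suggest(["heros", "light", "and", "magic"], chain)
```

On this toy chain the cheapest path is "heroes of might and magic" at a total cost of 2: the missing "of" is picked up for free, which a plain breadth-first walk with uniform step costs would not prefer. Transition probabilities or contextual boosts could be folded into the step cost later, as long as the heuristic stays a lower bound.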

Any comments on this? Questions?
