I've been working a bit with the spell checker. It does a pretty good
job when it comes to finding a simple typo.
I was thinking it would be nice if it could turn "heros light and
magic" into "did you mean: heroes of might and magic?".
My strategy is to combine a Markov chain, A* search and Levenshtein
distance.
Algorithm:
First I have to train the Markov chain with the token offsets from
Lucene.
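
A rough sketch of what the training step could look like, in Python
rather than against the Lucene API. It assumes the tokens of each
field have already been read out in position order as plain lists of
strings, and that a first-order (bigram) chain is enough. The name
train_markov and the sample data are made up for illustration.

```python
from collections import defaultdict


def train_markov(token_streams):
    """Build a first-order Markov chain: token -> {next token: probability}.

    token_streams is assumed to be an iterable of token lists, each list
    already sorted by position (hypothetical input, not a Lucene type).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for tokens in token_streams:
        # Count every adjacent pair of tokens in the stream.
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1

    chain = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        # Normalise the counts into transition probabilities.
        chain[prev] = {tok: n / total for tok, n in followers.items()}
    return chain


# Toy corpus, invented for the example.
streams = [
    ["heroes", "of", "might", "and", "magic"],
    ["heroes", "of", "might", "and", "magic"],
    ["heroes", "of", "the", "storm"],
]
chain = train_markov(streams)
print(chain["of"])
```

Tokens that never have a successor ("magic", "storm") simply do not
appear as keys, so a lookup has to tolerate missing entries.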
At query time I choose the cheapest A* path through the Markov chain
with as short a Levenshtein distance as possible.
I chose A* over breadth-first search because it allows zero cost for
stop words and leaves room for future contextual boosting.
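
To make the query-time idea concrete, here is a minimal Python sketch
of one way it could work. Everything in it is an assumption: the chain
is a hand-written dict of transition probabilities, a step costs
-log(p) plus the Levenshtein distance between the query token and the
suggested token, a stop word may be inserted at zero cost without
consuming a query token, and the heuristic is the sum of the smallest
possible edit distances of the remaining query tokens. The names
suggest and STOP_WORDS are hypothetical.

```python
import heapq
import math

STOP_WORDS = {"of", "the", "and"}  # assumed stop-word list


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def suggest(query, chain):
    """A* over states (query tokens consumed, last emitted token)."""
    if not query:
        return []
    vocab = set(chain) | {t for nxt in chain.values() for t in nxt}
    # Each remaining query token costs at least its smallest edit
    # distance to any known token, so this never overestimates.
    floor = [min(levenshtein(q, v) for v in vocab) for q in query]
    remaining = [sum(floor[i:]) for i in range(len(query) + 1)]

    # Heap entries: (g + h, g, query tokens consumed, emitted path).
    heap = []
    for v in vocab:
        g = levenshtein(query[0], v)
        heapq.heappush(heap, (g + remaining[1], g, 1, (v,)))

    closed = set()
    while heap:
        _, g, i, path = heapq.heappop(heap)
        if i == len(query):
            return list(path)
        state = (i, path[-1])
        if state in closed:
            continue
        closed.add(state)
        for nxt, p in chain.get(path[-1], {}).items():
            # Match nxt against the next query token.
            g2 = g - math.log(p) + levenshtein(query[i], nxt)
            heapq.heappush(heap, (g2 + remaining[i + 1], g2, i + 1, path + (nxt,)))
            # Or insert a stop word for free, consuming nothing.
            if nxt in STOP_WORDS:
                heapq.heappush(heap, (g + remaining[i], g, i, path + (nxt,)))
    return None  # no path through the chain covers the whole query


# Hand-written toy chain, invented for the example.
chain = {
    "heroes": {"of": 1.0},
    "of": {"might": 0.75, "the": 0.25},
    "might": {"and": 1.0},
    "and": {"magic": 1.0},
    "the": {"storm": 1.0},
}
result = suggest(["heros", "light", "and", "magic"], chain)
print(" ".join(result))  # heroes of might and magic
```

With a heuristic of zero this degrades to a uniform-cost search, so the
heuristic and the cost function are the places where contextual
boosting could later be plugged in.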
Any comments on this? Questions?