You were right about it not working on a big corpus.

I think there is a limit on the query, and a big corpus would exceed it.


I am looking at a similar thing myself, but I am still going through the basics.

Rgds
Prabhu


On 4/3/06, karl wettin <[EMAIL PROTECTED]> wrote:
>
>
> 31 mar 2006 kl. 06.54 skrev karl wettin:
>
> > I've been working a bit with the spell checker. It does a pretty
> > good job when it comes to finding a simple typo.
> > I was thinking it would be nice if I could turn "heros light and
> > magic" to "did you mean: heroes of might and magic?".
> >
> > My strategy is to combine Markov, A* and Levenshtein.
>
> > Any comments on this? Questions?
>
> Nothing? Not even a go-go-go? I would really like to discuss it with
> someone before I spend too much time on it. This is what it is: a
> simple Markov chain is similar to ngrams, but on a word level rather
> than character level. A* is a classic gaming algorithm to find the
> cheapest path in a matrix. I assume you all know Levenshtein from
> FuzzyQuery.
>
> I have been sleeping on this a bit and think it might not work on a
> big corpus. One probably has to limit it to one Markov chain per
> context of some kind. Say category or so.
>
> Perhaps there is some other forum, more focused on text analysis,
> that you would recommend?
>
>
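
For anyone reading this thread later, the strategy quoted above (a word-level Markov chain, a cheapest-path search, and Levenshtein distance) could be sketched roughly as follows. This is only an illustration under stated assumptions, not karl's implementation: the names build_chain and suggest, the "<s>" start marker, the max_edits and insert_cost parameters, and the tiny phrase list are all made up for the example. The search is a plain uniform-cost search, which is A* with a zero heuristic; a real A* would add an estimate of the remaining cost.

```python
import heapq
from collections import defaultdict

START = "<s>"  # hypothetical marker for the beginning of a phrase


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def build_chain(phrases):
    """Word-level Markov chain: maps each word to the words seen after it."""
    chain = defaultdict(set)
    for phrase in phrases:
        words = [START] + phrase.split()
        for left, right in zip(words, words[1:]):
            chain[left].add(right)
    return chain


def suggest(query, chain, max_edits=2, insert_cost=1.5):
    """Cheapest walk through the chain that accounts for every query word.

    Matching a query word against a chain word costs their edit distance;
    following the chain without consuming a query word (the user left a
    word out) costs insert_cost. Returns (suggestion, cost) or None.
    """
    words = query.split()
    # Heap entries: (cost so far, query position, last chain word, output)
    heap = [(0, 0, START, ())]
    seen = set()
    while heap:
        cost, pos, last, out = heapq.heappop(heap)
        if pos == len(words):
            return " ".join(out), cost
        if (pos, last) in seen:
            continue
        seen.add((pos, last))
        for nxt in chain.get(last, ()):
            dist = levenshtein(words[pos], nxt)
            if dist <= max_edits:
                heapq.heappush(heap, (cost + dist, pos + 1, nxt, out + (nxt,)))
            heapq.heappush(heap, (cost + insert_cost, pos, nxt, out + (nxt,)))
    return None


# Made-up phrase list standing in for an index of titles.
PHRASES = ["heroes of might and magic", "heroes of the storm", "light and shadow"]
chain = build_chain(PHRASES)
print(suggest("heros light and magic", chain))
```

The sketch also shows why a big corpus is a problem: every common word has a huge set of followers in the chain, so the number of paths to explore grows quickly. Keeping one chain per category, as suggested in the quoted message, would keep those follower sets small.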
