Hi

On 6 Aug 2009, at 09:26, Pierre Valade wrote:

>
> Hi Pat,
>
> Thanks for your answer. I'll look into raspell.

http://norvig.com/spell-correct.html may also be useful. Peter Norvig  
is Director of Research at Google.


It's a spelling corrector based on Bayesian analysis in 20-something
lines of Python code. The only downside is that you need to train it
with a word list, which could take a while to run.
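
To give a rough idea of the approach, here is a minimal sketch in the
spirit of that article. It is not Norvig's exact code: the names train,
edits1, correct and SAMPLE_TEXT are my own, the training text is a
made-up one-liner, and it only looks one edit away from the input
(Norvig's version also considers words two edits away). Word counts
stand in for the probability of each word.

```python
import re
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def train(text):
    """Count word frequencies; the counts stand in for the prior P(word)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)

def correct(word, counts):
    """Return the most frequent known word within one edit, else the word itself."""
    word = word.lower()
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    return max(candidates, key=counts.get) if candidates else word

# Hypothetical miniature training text; a real run would use a large corpus.
SAMPLE_TEXT = "spelling is hard but spelling correction helps with spelling and search"
COUNTS = train(SAMPLE_TEXT)

print(correct("speling", COUNTS))
```

The training step is the slow part on a big corpus, so in practice you
would probably build the counts once and cache them rather than
retraining on every request.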

There's a Ruby version, at around 40 lines, here:
http://lojic.com/blog/2008/09/04/how-to-write-a-spelling-corrector-in-ruby/


Quote from Peter Norvig's site:

> In the past week, two friends (Dean and Bill) independently told me  
> they were amazed at how Google does spelling correction so well and  
> quickly. Type in a search like [speling] and Google comes back in  
> 0.1 seconds or so with Did you mean: spelling. (Yahoo and Microsoft  
> are similar.) What surprised me is that I thought Dean and Bill,  
> being highly accomplished engineers and mathematicians, would have  
> good intuitions about statistical language processing problems such  
> as spelling correction. But they didn't, and come to think of it,  
> there's no reason they should: it was my expectations that were  
> faulty, not their knowledge.
> I figured they and many others could benefit from an explanation.  
> The full details of an industrial-strength spell corrector like  
> Google's would be more confusing than enlightening, but I figured  
> that on the plane flight home, in less than a page of code, I could  
> write a toy spelling corrector that achieves 80 or 90% accuracy at a  
> processing speed of at least 10 words per second.
>

Oskar
