Re: [Jprogramming] Simple and effective spelling corrector

Dan Bron Mon, 23 Nov 2015 09:49:13 -0800

Thanks, I had missed that.  Appreciate the pointer.

-Dan



> On Nov 23, 2015, at 12:45 PM, Raul Miller <[email protected]> wrote:
> 
> If you search for the string big.txt on the page
> http://norvig.com/spell-correct.html the second instance of that
> string is a link to http://norvig.com/big.txt and that is what I used.
> 
> Thanks,
> 
> -- 
> Raul
> 
> 
> On Mon, Nov 23, 2015 at 12:21 PM, Dan Bron <[email protected]> wrote:
>> Oh, interesting; I’m going to have to study this in more detail.  Thank you.
>> 
>> What did you use for your corpus? (big.txt)
>> 
>> -Dan
>> 
>> 
>>> On Nov 20, 2015, at 2:55 PM, Raul Miller <[email protected]> wrote:
>>> 
>>> If I have read his implementation properly, it works something like this:
>>> 
>>> require'regex'
>>> RX_OPTIONS_UTF8=: 0
>>> 
>>> alphabet=: (#~ ] ~: toupper) a.
>>> 
>>> countbigwords=:3 :0
>>> NB. handle persistent data explicitly
>>> big=. tolower fread '~user/temp/big.txt'
>>> they=. ;:(' ' (I.-.big e. alphabet)} big)
>>> words=: ~.they
>>> count=: (#/.~ they),0
>>> i.0 0
>>> )
>>> 
>>> alt=:4 :0
>>> c=. x{y
>>> (alphabet-.c) x}each<y
>>> )
>>> dubin=:4 :0
>>> (x{.y)&,each ,&(x}.y)each alphabet
>>> )
>>> 
>>> edits=:3 :0
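>>> del=. 1 <\. y                                   NB. deletions: drop each letter in turn
>>> trn=. ((<-1 2)&C.each }.<\y),each 2}.(<\.y),a:  NB. transpositions: swap each adjacent pair
>>> rpl=. ;alt&y each i.#y                          NB. replacements: every other letter at each position
>>> ins=. ~.;dubin&y each i.1+#y                    NB. insertions: every letter at each gap, duplicates removed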
>>> del,trn,rpl,ins
>>> )
>>> 
>>> best=:3 :0
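>>> n=. words i. y        NB. index of each candidate in words; unknown words get #words
>>> y{~(i. >./)n{count    NB. most frequent candidate; the 0 appended to count covers unknown words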
>>> )
>>> 
>>> correct=:3 :0
>>> w=. <y
>>> if. w e. words do. w return. end.
>>> e=. edits y
>>> if. 1 e. e e. words do. best e return. end.
>>> e2=. ;edits each e
>>> if. 1 e. e2 e. words do. best e2 return. end.
>>> w
>>> )
>>> 
>>> countbigwords''
>>> 
>>> Seems plausible enough on a few simple tests.
>>> 
>>> Example use:
>>> 
>>>  correct 'thatl'
>>> +----+
>>> |that|
>>> +----+
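>>> 
>>> A few more session lines can make the pieces easier to follow. This is
>>> only a sketch: it assumes the definitions above are loaded, and the
>>> words 'cat' and 'act' are arbitrary illustrations, not part of the
>>> tests mentioned above.
>>> 
>>>  1 <\. 'cat'               NB. deletions: the boxed words at, ct, ca
>>>  # 0 alt 'cat'             NB. 25: the first letter replaced by each other letter
>>>  # 1 dubin 'cat'           NB. 26: each letter inserted after the first letter
>>>  (<'act') e. edits 'cat'   NB. 1: swapping the first two letters is a transposition
>>>  (i. >./) 3 9 2            NB. 1: index of the largest count, as used in best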
>>> 
>>> Thanks,
>>> 
>>> --
>>> Raul
>>> 
>>> On Thu, Nov 19, 2015 at 12:06 PM, Dan Bron <[email protected]> wrote:
>>>> Peter Norvig has a blog entry on how to write a fairly effective spelling 
>>>> corrector (75-90%) in very little code, using some Bayesian analysis:
>>>> 
>>>>    http://norvig.com/spell-correct.html 
>>>> 
>>>> A worthwhile read.
>>>> 
>>>> I’m using this program as an exercise in learning Perl6 (which, believe it 
>>>> or not, now has an official release date). I wonder though, how would it 
>>>> look in J?
>>>> 
>>>> -Dan
>>>> ----------------------------------------------------------------------
>>>> For information about J forums see http://www.jsoftware.com/forums.htm

