Thanks, I had missed that. Appreciate the pointer.

-Dan

> On Nov 23, 2015, at 12:45 PM, Raul Miller <[email protected]> wrote:
>
> If you search for the string big.txt on the page
> http://norvig.com/spell-correct.html the second instance of that
> string is a link to http://norvig.com/big.txt and that is what I used.
>
> Thanks,
>
> --
> Raul
>
>
> On Mon, Nov 23, 2015 at 12:21 PM, Dan Bron <[email protected]> wrote:
>> Oh, interesting; I’m going to have to study this in more detail. Thank you.
>>
>> What did you use for your corpus? (big.txt)
>>
>> -Dan
>>
>>
>>> On Nov 20, 2015, at 2:55 PM, Raul Miller <[email protected]> wrote:
>>>
>>> If I have read his implementation properly, it works something like this:
>>>
>>> require'regex'
>>> RX_OPTIONS_UTF8=: 0
>>>
>>> alphabet=: (#~ ] ~: toupper) a.
>>>
>>> countbigwords=:3 :0
>>>   NB. handle persistent data explicitly
>>>   big=. tolower fread '~user/temp/big.txt'
>>>   they=. ;:(' ' (I.-.big e. alphabet)} big)
>>>   words=: ~.they
>>>   count=: (#/.~ they),0
>>>   i.0 0
>>> )
>>>
>>> alt=:4 :0
>>>   c=. x{y
>>>   (alphabet-.c) x}each<y
>>> )
>>>
>>> dubin=:4 :0
>>>   (x{.y)&,each ,&(x}.y)each alphabet
>>> )
>>>
>>> edits=:3 :0
>>>   del=. 1 <\. y
>>>   trn=. ((<-1 2)&C.each }.<\y),each 2}.(<\.y),a:
>>>   rpl=. ;alt&y each i.#y
>>>   ins=. ~.;dubin&y each i.1+#y
>>>   del,trn,rpl,ins
>>> )
>>>
>>> best=:3 :0
>>>   n=. words i. y
>>>   y{~(i. >./)n{count
>>> )
>>>
>>> correct=:3 :0
>>>   w=. <y
>>>   if. w e. words do. w return. end.
>>>   e=. edits y
>>>   if. 1 e. e e. words do. best e return. end.
>>>   e2=. ;edits each e
>>>   if. 1 e. e2 e. words do. best e2 return. end.
>>>   w
>>> )
>>>
>>> countbigwords''
>>>
>>> Seems plausible enough on a few simple tests.
>>>
>>> Example use:
>>>
>>>    correct 'thatl'
>>> +----+
>>> |that|
>>> +----+
>>>
>>> Thanks,
>>>
>>> --
>>> Raul
>>>
>>> On Thu, Nov 19, 2015 at 12:06 PM, Dan Bron <[email protected]> wrote:
>>>> Peter Norvig has a blog entry on how to write a fairly effective spelling
>>>> corrector (75-90%) in very little code, using some Bayesian analysis:
>>>>
>>>> http://norvig.com/spell-correct.html
>>>>
>>>> A worthwhile read.
>>>>
>>>> I’m using this program as an exercise in learning Perl6 (which, believe it
>>>> or not, now has an official release date). I wonder though, how would it
>>>> look in J?
>>>>
>>>> -Dan
>>>> ----------------------------------------------------------------------
>>>> For information about J forums see http://www.jsoftware.com/forums.htm
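
For comparison, the candidate-generation logic of the quoted J verbs `edits` and `correct` can be sketched in Python, the language of Norvig's article. This is only a minimal sketch under stated assumptions, not Norvig's code and not a verified port of the J: the function names simply mirror the J verbs, and the small `counts` table is made-up illustration data standing in for the word counts built from big.txt.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase


def edits(word):
    """Return the set of strings one edit away from `word`.

    Mirrors the four groups in the J verb: deletions, transpositions of
    adjacent letters, replacements, and insertions.
    """
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    # Like the J `alt` verb, skip replacing a letter with itself.
    replaces = [a + c + b[1:] for a, b in splits if b
                for c in ALPHABET if c != b[0]]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts):
    """Pick the most frequent known word within two edits of `word`."""
    if word in counts:
        return word
    one_edit = edits(word)
    known = [w for w in one_edit if w in counts]
    if known:
        return max(known, key=counts.get)
    known2 = {w for e in one_edit for w in edits(e) if w in counts}
    if known2:
        return max(known2, key=counts.get)
    return word  # nothing known nearby: return the input unchanged


# Hypothetical toy counts; a real run would count the words in big.txt.
counts = Counter({"the": 50, "that": 10, "than": 3})

if __name__ == "__main__":
    print(correct("thatl", counts))  # -> that
```

As in the J version, a known word is returned as-is, words one edit away are preferred over words two edits away, and frequency only decides among candidates at the same distance.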
