Oh, interesting; I’m going to have to study this in more detail. Thank you.
What did you use for your corpus? (big.txt)

-Dan

> On Nov 20, 2015, at 2:55 PM, Raul Miller <[email protected]> wrote:
>
> If I have read his implementation properly, it works something like this:
>
> require'regex'
> RX_OPTIONS_UTF8=: 0
>
> alphabet=: (#~ ] ~: toupper) a.
>
> countbigwords=:3 :0
>   NB. handle persistent data explicitly
>   big=. tolower fread '~user/temp/big.txt'
>   they=. ;:(' ' (I.-.big e. alphabet)} big)
>   words=: ~.they
>   count=: (#/.~ they),0
>   i.0 0
> )
>
> alt=:4 :0
>   c=. x{y
>   (alphabet-.c) x}each<y
> )
> dubin=:4 :0
>   (x{.y)&,each ,&(x}.y)each alphabet
> )
>
> edits=:3 :0
>   del=. 1 <\. y
>   trn=. ((<-1 2)&C.each }.<\y),each 2}.(<\.y),a:
>   rpl=. ;alt&y each i.#y
>   ins=. ~.;dubin&y each i.1+#y
>   del,trn,rpl,ins
> )
>
> best=:3 :0
>   n=. words i. y
>   y{~(i. >./)n{count
> )
>
> correct=:3 :0
>   w=. <y
>   if. w e. words do. w return. end.
>   e=. edits y
>   if. 1 e. e e. words do. best e return. end.
>   e2=. ;edits each e
>   if. 1 e. e2 e. words do. best e2 return. end.
>   w
> )
>
> countbigwords''
>
> Seems plausible enough on a few simple tests.
>
> Example use:
>
>    correct 'thatl'
> +----+
> |that|
> +----+
>
> Thanks,
>
> --
> Raul
>
> On Thu, Nov 19, 2015 at 12:06 PM, Dan Bron <[email protected]> wrote:
>> Peter Norvig has a blog entry on how to write a fairly effective spelling
>> corrector (75-90%) in very little code, using some Bayesian analysis:
>>
>> http://norvig.com/spell-correct.html
>>
>> A worthwhile read.
>>
>> I’m using this program as an exercise in learning Perl6 (which, believe it
>> or not, now has an official release date). I wonder though, how would it
>> look in J?
>>
>> -Dan
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
