You are correct, I should have cleaned those out.

As for "similarly but not the same", I expect that my not having used Perl 6 is enough to make that true. Perhaps I should also be returning a flat string instead of a boxed string; that would be a simple change. But I believe I am returning the same word choice for any specific example, despite any claims he might have made in his writeup.
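
For what it's worth, the flat-string variant could be sketched roughly as follows. This assumes the definitions from the quoted message below are already loaded and countbigwords'' has been run; correctflat is a hypothetical name used only for illustration and is not part of the original code.

   NB. hypothetical wrapper: open (>) the boxed word that correct returns
   correctflat=: >@correct

With the same big.txt, the 'thatl' example from the quoted message should then display as plain text rather than as a boxed word:

   correctflat 'thatl'
that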

Let me know if you can point to any counterexamples.

Thanks,

--
Raul

On Sat, Nov 21, 2015 at 6:55 AM, Joe Bogner wrote:
> Raul, thanks for sharing. It seems to work similarly, if not the same, as
> what's in the article. I particularly like your use of C. which I rarely
> see in the wild.
>
> note: It doesn't seem like the regex require or RX_OPTIONS_UTF8=: 0 is
> needed... I would guess that was leftover from your previous message on rx
> and utf8
>
> On Fri, Nov 20, 2015 at 2:55 PM, Raul Miller wrote:
>
>> If I have read his implementation properly, it works something like this:
>>
>> require'regex'
>> RX_OPTIONS_UTF8=: 0
>>
>> alphabet=: (#~ ] ~: toupper) a.
>>
>> countbigwords=:3 :0
>> NB. handle persistent data explicitly
>> big=. tolower fread '~user/temp/big.txt'
>> they=. ;:(' ' (I.-.big e. alphabet)} big)
>> words=: ~.they
>> count=: (#/.~ they),0
>> i.0 0
>> )
>>
>> alt=:4 :0
>> c=. x{y
>> (alphabet-.c) x}each<y
>> )
>>
>> dubin=:4 :0
>> (x{.y)&,each ,&(x}.y)each alphabet
>> )
>>
>> edits=:3 :0
>> del=. 1 <\. y
>> trn=. ((<-1 2)&C.each }.<\y),each 2}.(<\.y),a:
>> rpl=. ;alt&y each i.#y
>> ins=. ~.;dubin&y each i.1+#y
>> del,trn,rpl,ins
>> )
>>
>> best=:3 :0
>> n=. words i. y
>> y{~(i. >./)n{count
>> )
>>
>> correct=:3 :0
>> w=. <y
>> if. w e. words do. w return. end.
>> e=. edits y
>> if. 1 e. e e. words do. best e return. end.
>> e2=. ;edits each e
>> if. 1 e. e2 e. words do. best e2 return. end.
>> w
>> )
>>
>> countbigwords''
>>
>> Seems plausible enough on a few simple tests.
>>
>> Example use:
>>
>>    correct 'thatl'
>> +----+
>> |that|
>> +----+
>>
>> Thanks,
>>
>> --
>> Raul
>>
>> On Thu, Nov 19, 2015 at 12:06 PM, Dan Bron wrote:
>> > Peter Norvig has a blog entry on how to write a fairly effective
>> > spelling corrector (75-90%) in very little code, using some Bayesian
>> > analysis:
>> >
>> > http://norvig.com/spell-correct.html
>> >
>> > A worthwhile read.
>> >
>> > I’m using this program as an exercise in learning Perl6 (which, believe
>> > it or not, now has an official release date). I wonder though, how would it
>> > look in J?
>> >
>> > -Dan
>> > ----------------------------------------------------------------------
>> > For information about J forums see http://www.jsoftware.com/forums.htm
