Re: [Jprogramming] Simple and effective spelling corrector

Raul Miller Fri, 20 Nov 2015 12:01:59 -0800

If I have read his implementation properly, it works something like this:

require'regex'
RX_OPTIONS_UTF8=: 0


NB. lowercase letters: the characters of a. that toupper changes
alphabet=: (#~ ] ~: toupper) a.

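countbigwords=:3 :0
  NB. handle persistent data explicitly
  NB. (words and count are globals, built once from the training text)
  big=. tolower fread '~user/temp/big.txt'
  NB. blank out every non-letter, then box the space-separated words
  they=. ;:(' ' (I.-.big e. alphabet)} big)
  words=: ~.they
  NB. trailing 0 is the count for unknown words (words i. y gives #words)
  count=: (#/.~ they),0
  i.0 0
)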

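alt=:4 :0
  NB. x: index into word y; replace that letter with every other letter
  c=. x{y
  (alphabet-.c) x}each<y
)
dubin=:4 :0
  NB. x: position in word y; insert each letter of the alphabet there
  (x{.y)&,each ,&(x}.y)each alphabet
)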

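edits=:3 :0
  NB. boxed list of every string one edit away from y
  NB. del: drop one letter; trn: swap two adjacent letters
  del=. 1 <\. y
  trn=. ((<_1 _2)&C.each }.<\y),each 2}.(<\.y),a:
  NB. rpl: change one letter; ins: add one letter (duplicates removed)
  rpl=. ;alt&y each i.#y
  ins=. ~.;dubin&y each i.1+#y
  del,trn,rpl,ins
)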

best=:3 :0
  NB. of the boxed candidates y, pick the one with the highest count
  n=. words i. y
  y{~(i. >./)n{count
)

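correct=:3 :0
  NB. prefer a known word, then a known word one edit away,
  NB. then a known word two edits away; otherwise return y unchanged
  w=. <y
  if. w e. words do. w return. end.
  e=. edits y
  if. 1 e. e e. words do. best e return. end.
  e2=. ;edits each e
  if. 1 e. e2 e. words do. best e2 return. end.
  w
)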

countbigwords''
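
The tokenizing and counting steps in countbigwords can be tried without big.txt. A minimal sketch, assuming a made-up sample sentence (the names sample, toks, uniq, freq and lookup are only for illustration and are not part of the script above):

```j
NB. Illustrative sketch only: the sample text and these names are assumptions.
alphabet=: (#~ ] ~: toupper) a.    NB. lowercase letters, as in the script
sample=: tolower 'The cat; the dog. THE END'

NB. replace every non-letter with a space, then box the words with ;:
toks=: ;: ' ' (I. -. sample e. alphabet)} sample

uniq=: ~. toks                     NB. distinct words, in order of first appearance
freq=: (#/.~ toks) , 0             NB. occurrences per distinct word, plus 0 for "not found"

NB. look up one known and one unknown word
lookup=: freq {~ uniq i. 'the';'bird'
```

Here uniq holds the four words the, cat, dog and end, freq is 3 1 1 1 0, and lookup is 3 0: the unknown word indexes one past the end of uniq and so picks up the trailing 0.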

Seems plausible enough on a few simple tests.

Example use:

   correct 'thatl'
+----+
|that|
+----+
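
The deletion and transposition lines in edits are the least obvious, so here they are run on their own. This is only a sketch on the word 'that', and the global names w, del and trn are assumptions for the demonstration:

```j
NB. Illustrative sketch only: w, del and trn are demo names.
w=: 'that'

NB. deletions: outfix (\.) with x=1 drops one letter at each position
del=: 1 <\. w

NB. transpositions: swap the last two letters of each prefix of length 2 or more,
NB. then append the part of the word that follows that prefix
trn=: ((<_1 _2)&C. each }. <\ w) ,each 2 }. (<\. w) , a:
```

del comes out as hat, tat, tht, tha, and trn as htat, taht, thta. The appended a: supplies the empty tail for the final prefix, which is the whole word.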

Thanks,

-- 
Raul

On Thu, Nov 19, 2015 at 12:06 PM, Dan Bron wrote:
> Peter Norvig has a blog entry on how to write a fairly effective spelling 
> corrector (75-90%) in very little code, using some Bayesian analysis:
>
>      http://norvig.com/spell-correct.html
>
> A worthwhile read.
>
> I’m using this program as an exercise in learning Perl6 (which, believe it or 
> not, now has an official release date). I wonder though, how would it look in 
> J?
>
> -Dan
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
