Re: [Jprogramming] Simple and effective spelling corrector

Raul Miller Sat, 21 Nov 2015 09:33:06 -0800

You are correct, I should have cleaned those out.

As for "similarly but not the same", I expect that having not used
perl6 would be sufficient for that to be true. Perhaps also I should
be returning a flat string instead of a boxed string. That would be a
simple change. But I believe I am returning the same word choice for
any specific example, despite any claims he might have made in his
writeup.
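
A minimal sketch of that change, assuming the definitions from the
earlier message (quoted below) are loaded and countbigwords has been
run. The name correctflat is hypothetical and is not part of the
original code:

NB. correctflat is a made-up name: open (>) the box that correct returns
correctflat=: >@correct

   correctflat 'thatl'
that

Composing with > leaves correct itself untouched, so callers that want
the boxed result can keep using it.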


Let me know if you can point to any counterexamples.

Thanks,

-- 
Raul

On Sat, Nov 21, 2015 at 6:55 AM, Joe Bogner <[email protected]> wrote:
> Raul, thanks for sharing. It seems to work similarly, if not the same, as
> what's in the article. I particularly like your use of C. which I rarely
> see in the wild.
>
> note: It doesn't seem like the regex require or RX_OPTIONS_UTF8=: 0 is
> needed... I would guess that was leftover from your previous message on rx
> and utf8
>
> On Fri, Nov 20, 2015 at 2:55 PM, Raul Miller <[email protected]> wrote:
>
>> If I have read his implementation properly, it works something like this:
>>
>> require'regex'
>> RX_OPTIONS_UTF8=: 0
>>
>> alphabet=: (#~ ] ~: toupper) a.
>>
>> countbigwords=:3 :0
>>   NB. handle persistent data explicitly
>>   big=. tolower fread '~user/temp/big.txt'
>>   they=. ;:(' ' (I.-.big e. alphabet)} big)
>>   words=: ~.they
>>   count=: (#/.~ they),0
>>   i.0 0
>> )
>>
>> alt=:4 :0
>>   c=. x{y
>>   (alphabet-.c) x}each<y
>> )
>> dubin=:4 :0
>>   (x{.y)&,each ,&(x}.y)each alphabet
>> )
>>
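>> edits=:3 :0
>>   NB. all strings one edit away from y, as a list of boxes
>>   del=. 1 <\. y   NB. deletions: drop each letter in turn
>>   NB. transpositions: swap each pair of adjacent letters
>>   trn=. ((<-1 2)&C.each }.<\y),each 2}.(<\.y),a:
>>   rpl=. ;alt&y each i.#y   NB. replacements, via alt
>>   ins=. ~.;dubin&y each i.1+#y   NB. insertions, via dubin
>>   del,trn,rpl,ins
>> )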
>>
>> best=:3 :0
>>   n=. words i. y
>>   y{~(i. >./)n{count
>> )
>>
>> correct=:3 :0
>>   w=. <y
>>   if. w e. words do. w return. end.
>>   e=. edits y
>>   if. 1 e. e e. words do. best e return. end.
>>   e2=. ;edits each e
>>   if. 1 e. e2 e. words do. best e2 return. end.
>>   w
>> )
>>
>> countbigwords''
>>
>> Seems plausible enough on a few simple tests.
>>
>> Example use:
>>
>>    correct 'thatl'
>> +----+
>> |that|
>> +----+
>>
>> Thanks,
>>
>> --
>> Raul
>>
>> On Thu, Nov 19, 2015 at 12:06 PM, Dan Bron <[email protected]> wrote:
>> > Peter Norvig has a blog entry on how to write a fairly effective
>> > spelling corrector (75-90%) in very little code, using some Bayesian
>> > analysis:
>> >
>> >      http://norvig.com/spell-correct.html
>> >
>> > A worthwhile read.
>> >
>> > I’m using this program as an exercise in learning Perl6 (which, believe
>> > it or not, now has an official release date). I wonder though, how would it
>> > look in J?
>> >
>> > -Dan
>> > ----------------------------------------------------------------------
>> > For information about J forums see http://www.jsoftware.com/forums.htm