Levdist=: 4 : 0
'a b'=. (\:#&>) x;y
p=: 2 0$0
z=.0
while. a *.&# b do.
t=. a ((i.<./)@:(+...@#)([>:@,~{)])@:i. b
z=.z+<:>./t
p=:p,.t {.&.> a;b
'a b'=. t }.&.> a;b
end.
z=.z+ a +&# b
)
'excused' Levdist 'exhausted'
3
p NB. exhibits distance cf. http://www.levenshtein.net/index.html
+-+-+---+-+--+-+
|e|x|hau|s|te|d|
+-+-+---+-+--+-+
|e|x|cu |s|e |d|
+-+-+---+-+--+-+
'levenshtein' Levdist 'malenstein'
5 NB. is the right answer
p
+---+-+---+-+--+-+-+-+
|l |e|ven|s|ht|e|i|n|
+---+-+---+-+--+-+-+-+
|mal|e|n |s|t |e|i|n|
+---+-+---+-+--+-+-+-+
ts'''excused'' Levdist&(100,@(#,:)]) ''exhausted'''
0.062798674 88000
ts'''excused'' levdist&(100,@(#,:)]) ''exhausted'''
|limit error: levdist
| stap"1 xs,"0/&}.ys
ts'''excused'' levdist&(10,@(#,:)]) ''exhausted'''
0.34071143 2.6334221e8
ts'''excused'' Levdist&(10,@(#,:)]) ''exhausted'''
0.0019793917 4672
'excused' (Levdist-:levdist)&(10,@(#,:)]) 'exhausted'
1
R.E Boss
> -----Oorspronkelijk bericht-----
> Van: [email protected] [mailto:programming-
> [email protected]] Namens Aai
> Verzonden: dinsdag 28 april 2009 17:15
> Aan: Programming forum
> Onderwerp: Re: [Jprogramming] Levenshtein distance
>
> Here are IMO some optimizations:
>
> offsets for indices of neighbors advance : B
> sentence for initial matrix C much shorter
> 'match' matrix of same size, in advance : M
> no boxing of indices
>
> levdist=: 3 : 0
> :
> B=: 3 2$_1 0 0 _1 _1 _1
> M=: 0,0,. x =/ y
> C=: (ys=.i.>:#y),,.}.xs=.i.>:#x
> stap"1 xs ,"0/&}. ys
> {:, C
> )
>
> stap=: 3 : 0
> u=. >: <./ (0 0,- M {~ <y) + C{~;/B +"1/ y
> C=: u (<y)}C
> )
>
>
> 'excused' levdist 'exhausted'
> 3
>
> 'levenshtein' levdist 'malenstein'
> 4
>
>
>
>
> Hallo Jan Jacobs, je schreef op 28-04-09 11:42:
> > ls,
> > I modeled the edit-distance (or Levenshtein-distance
> > http://en.wikipedia.org/wiki/Levenshtein_distance) function for
> > strings, see below. I am not very proud of it. Hints for improvements
> > on elegancy
> > are welcomed (perhaps by using Sequential Machine ;: ??).
> > Thanks in advance,
> > Jan.
> >
> > NB. Levenshtein
> > NB. y. ~ word1
> > NB. x. ~ word2
> > Levenshtein=:3 : 0
> > :
> > ]C=:(i.>:#x)(<"1(0,.~xs=.i.>:#x))}
> > (ys=.i.>:#y)(0)}0$~>:(#X=:x),#Y=:y NB. init
> > ]ind=.,<"1 (}.xs),"0/ }.ys
> > NB. relevant indices
> > step&.> ind
> > {:{:"1 C
> > )
> > NB. make single step in C matrix
> > NB. y. ~ current position
> > step=:3 : 0
> > ]ins=.>:C{~<_1 0+y
> > ]del=.>:C{~<0 _1+y
> > ]nochxc=.(>:C{~<_1 _1+y)-(X{~<:{.y)=Y{~<:{:y
> > ]C=:(<./ins,del,nochxc)(<y)}C
> > )
> > d=:1!:2&2
> > d 'asap'Levenshtein'aap' NB. 1
> > d 'excused'Levenshtein'exhausted' NB. 3
> >
> >
> >
> > Jan Jacobs
> > Esdoornstraat 33
> > 5995AN Kessel
> > T: +31 77 462 1887
> > M: +31 6 23 82 55 21
> > E: [email protected]
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
> >
>
> --
> =@@i
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm