Here are IMO some optimizations:
offsets for indices of neighbors advance : B
sentence for initial matrix C much shorter
'match' matrix of same size, in advance : M
no boxing of indices
levdist=: 3 : 0
:
B=: 3 2$_1 0 0 _1 _1 _1
M=: 0,0,. x =/ y
C=: (ys=.i.>:#y),,.}.xs=.i.>:#x
stap"1 xs ,"0/&}. ys
{:, C
)
stap=: 3 : 0
u=. >: <./ (0 0,- M {~ <y) + C{~;/B +"1/ y
C=: u (<y)}C
)
'excused' levdist 'exhausted'
3
'levenshtein' levdist 'malenstein'
4
Hallo Jan Jacobs, je schreef op 28-04-09 11:42:
> ls,
> I modeled the edit-distance (or Levenshtein-distance
> http://en.wikipedia.org/wiki/Levenshtein_distance) function for
> strings, see below. I am not very proud of it. Hints for improvements
> on elegancy
> are welcomed (perhaps by using Sequential Machine ;: ??).
> Thanks in advance,
> Jan.
>
> NB. Levenshtein
> NB. y. ~ word1
> NB. x. ~ word2
> Levenshtein=:3 : 0
> :
> ]C=:(i.>:#x)(<"1(0,.~xs=.i.>:#x))}
> (ys=.i.>:#y)(0)}0$~>:(#X=:x),#Y=:y NB. init
> ]ind=.,<"1 (}.xs),"0/ }.ys
> NB. relevant indices
> step&.> ind
> {:{:"1 C
> )
> NB. make single step in C matrix
> NB. y. ~ current position
> step=:3 : 0
> ]ins=.>:C{~<_1 0+y
> ]del=.>:C{~<0 _1+y
> ]nochxc=.(>:C{~<_1 _1+y)-(X{~<:{.y)=Y{~<:{:y
> ]C=:(<./ins,del,nochxc)(<y)}C
> )
> d=:1!:2&2
> d 'asap'Levenshtein'aap' NB. 1
> d 'excused'Levenshtein'exhausted' NB. 3
>
>
>
> Jan Jacobs
> Esdoornstraat 33
> 5995AN Kessel
> T: +31 77 462 1887
> M: +31 6 23 82 55 21
> E: [email protected]
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
--
=@@i
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm