Here are IMO some optimizations:

offsets for indices of neighbors advance :   B
sentence for initial matrix C much shorter
'match' matrix of same size, in advance  :   M
no boxing of indices

levdist=: 3 : 0
:
  B=: 3 2$_1 0 0 _1 _1 _1
  M=: 0,0,. x =/ y
  C=: (ys=.i.>:#y),,.}.xs=.i.>:#x
  stap"1 xs ,"0/&}. ys
  {:, C
)

stap=: 3 : 0
   u=. >: <./ (0 0,- M {~ <y) + C{~;/B +"1/ y
   C=: u (<y)}C
)


   'excused' levdist 'exhausted'
3

   'levenshtein' levdist 'malenstein'
4




Hallo Jan Jacobs, je schreef op 28-04-09 11:42:
> ls,
> I modeled the edit-distance (or Levenshtein-distance
> http://en.wikipedia.org/wiki/Levenshtein_distance) function for
> strings, see below. I am not very proud of it. Hints for improvements
> on elegancy
> are welcomed (perhaps by using Sequential Machine ;:  ??).
> Thanks in advance,
> Jan.
>
> NB. Levenshtein
> NB. y.      ~     word1
> NB. x.      ~     word2
> Levenshtein=:3 : 0
> :
>       ]C=:(i.>:#x)(<"1(0,.~xs=.i.>:#x))}
> (ys=.i.>:#y)(0)}0$~>:(#X=:x),#Y=:y   NB. init
>       ]ind=.,<"1 (}.xs),"0/ }.ys
>                             NB. relevant indices
>       step&.> ind
>       {:{:"1 C
> )
> NB. make single step in C matrix
> NB. y.      ~     current position
> step=:3 : 0
>       ]ins=.>:C{~<_1 0+y
>       ]del=.>:C{~<0 _1+y
>       ]nochxc=.(>:C{~<_1 _1+y)-(X{~<:{.y)=Y{~<:{:y
>       ]C=:(<./ins,del,nochxc)(<y)}C
> )
> d=:1!:2&2
> d 'asap'Levenshtein'aap'                        NB. 1
> d 'excused'Levenshtein'exhausted'         NB. 3
>
>
>
> Jan Jacobs
> Esdoornstraat 33
> 5995AN Kessel
> T: +31 77 462 1887
> M: +31 6 23 82 55 21
> E: [email protected]
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>   

-- 
=@@i

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to