Re: [Jprogramming] Levenshtein distance

Viktor Cerovski Tue, 28 Apr 2009 07:43:18 -0700


Jan Jacobs-3 wrote:
> 
> ls,
> I modeled the edit-distance (or Levenshtein-distance
> http://en.wikipedia.org/wiki/Levenshtein_distance) function for
> strings, see below. I am not very proud of it. Hints for improvements
> on elegancy
> are welcomed (perhaps by using Sequential Machine ;:  ??).
> Thanks in advance,
> Jan.
> 
> NB. Levenshtein
> NB. y.      ~     word1
> NB. x.      ~     word2
> Levenshtein=:3 : 0
> :
>       ]C=:(i.>:#x)(<"1(0,.~xs=.i.>:#x))}
> (ys=.i.>:#y)(0)}0$~>:(#X=:x),#Y=:y   NB. init
>       ]ind=.,<"1 (}.xs),"0/ }.ys
>                             NB. relevant indices
>       step&.> ind
>       {:{:"1 C
> )
> NB. make single step in C matrix
> NB. y.      ~     current position
> step=:3 : 0
>       ]ins=.>:C{~<_1 0+y
>       ]del=.>:C{~<0 _1+y
>       ]nochxc=.(>:C{~<_1 _1+y)-(X{~<:{.y)=Y{~<:{:y
>       ]C=:(<./ins,del,nochxc)(<y)}C
> )
> d=:1!:2&2
> d 'asap'Levenshtein'aap'                        NB. 1
> d 'excused'Levenshtein'exhausted'         NB. 3
> 
> [...]
>


A fairly straightforward (in the sense of being inefficient) implementation
of 
the recursive definition of the Levenshtein distance, would be Ldist:

f=.>:@Ldist }:
mincost=: f~ <. f <. -...@=&{: + Ldist&}:
Ldist=: ([...@.<&#)`minc...@.(0 < *&#)

which can be composed into a one-liner:

Ldist1=:([...@.<&#)`((>:@$:}:)~<.(>:@$:}:)<. -...@=&{: + $:&}:)@.(0<*&#)

-- 
View this message in context: 
http://www.nabble.com/Levenshtein-distance-tp23273653s24193p23278509.html
Sent from the J Programming mailing list archive at Nabble.com.

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] Levenshtein distance

Reply via email to