Jan Jacobs-3 wrote:
>
> ls,
> I modeled the edit-distance (or Levenshtein-distance
> http://en.wikipedia.org/wiki/Levenshtein_distance) function for
> strings, see below. I am not very proud of it. Hints for improvements
> on elegancy
> are welcomed (perhaps by using Sequential Machine ;: ??).
> Thanks in advance,
> Jan.
>
> NB. Levenshtein
> NB. y. ~ word1
> NB. x. ~ word2
> Levenshtein=:3 : 0
> :
> ]C=:(i.>:#x)(<"1(0,.~xs=.i.>:#x))}
> (ys=.i.>:#y)(0)}0$~>:(#X=:x),#Y=:y NB. init
> ]ind=.,<"1 (}.xs),"0/ }.ys
> NB. relevant indices
> step&.> ind
> {:{:"1 C
> )
> NB. make single step in C matrix
> NB. y. ~ current position
> step=:3 : 0
> ]ins=.>:C{~<_1 0+y
> ]del=.>:C{~<0 _1+y
> ]nochxc=.(>:C{~<_1 _1+y)-(X{~<:{.y)=Y{~<:{:y
> ]C=:(<./ins,del,nochxc)(<y)}C
> )
> d=:1!:2&2
> d 'asap'Levenshtein'aap' NB. 1
> d 'excused'Levenshtein'exhausted' NB. 3
>
> [...]
>
A fairly straightforward (in the sense of being inefficient) implementation
of
the recursive definition of the Levenshtein distance, would be Ldist:
f=.>:@Ldist }:
mincost=: f~ <. f <. -...@=&{: + Ldist&}:
Ldist=: ([...@.<&#)`minc...@.(0 < *&#)
which can be composed into a one-liner:
Ldist1=:([...@.<&#)`((>:@$:}:)~<.(>:@$:}:)<. -...@=&{: + $:&}:)@.(0<*&#)
--
View this message in context:
http://www.nabble.com/Levenshtein-distance-tp23273653s24193p23278509.html
Sent from the J Programming mailing list archive at Nabble.com.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm