In http://www.jsoftware.com/jwiki/Essays/Longest%20Common%20Subsequence lcs is 
described which gives the length of the longest
common subsequence. 
The following solution is quite a bit more efficient.

   'X Y'=: <@('ACTG'{~ 4 ?@$~ ])"0 [200 200

   X lcs Y
127

   {: ((>./\@:+ |.!.0)>. ])/|.  X=/Y
127
   
   ts'X lcs Y'
0.099401083 7474688

   ts'{: ((>./\@:+ |.!.0)>. ])/|.  X=/Y'
0.00057205218 148864

0.099401083 7474688 % 0.00057205218 148864
173.76227 50.211522

It scales rather well:

      'X Y'=: <@('ACTG'{~ 4 ?@$~ ])"0 [2000 2000

lcsB=: 4 : '{: ((>./\@:+ |.!.0)>. ])/|.  x=/y'

   ts' X lcsB Y'
0.040288245 8485248

      'X Y'=: <@('ACTG'{~ 4 ?@$~ ])"0 [20000 20000

   ts'  X lcsB Y '
4.1813038 1.0751246e9

So it can be easily used for script comparison.

The longest common subsequence is given by:

'p q'=: 'armageddon in the desert';'armed and dangerous'        NB.
http://www.jsoftware.com/pipermail/general/2003-March/015449.html

       (q;p)(];{~L:0"0 1) |:>({:"1 {.L:0@{.@(~.@:({:"1) /:~ {:"1 </. }:"1)/. 
}:"1) (5&$. ,.~ [: (,+/)"(1) 4&$.) $.   M * ((>./\@:+
|.!.0)>. ])/\.&.|.  M=:  q=/p
+---------------------------+-----------+-----------+
|0 1 2 3 4  5  7  9 10 14 15|armed n der|armed n der|
|0 1 2 5 6 10 12 13 18 19 22|           |           |
+---------------------------+-----------+-----------+


R.E. Boss


(Add your info to http://www.jsoftware.com/jwiki/Community/Demographics )




----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to