I remember a lengthy LCS discussion, and the solutions for the most
part used the regex engine.  I also looked on CPAN, but couldn't find
a canonical LCS module.

Has anyone developed an implementation of the well-known bitstring
algorithm?  Basically you convert your data strings to bitstrings, AND
the two, and look for the longest match.  Then you rotate the shorter
data string by one, and repeat the test.  Repeat for the number of
bits in the shorter data string.  I want to do it, but just wanted to
check if it had come up before on FWP.

There are even better algorithms to do this.  I found them with
Google; for instance the Goeman & Clausen 1999 "A New Practical Linear
Space Algorithm for the LCS Problem" paper looks interesting but is
beyond what I remember of my college math.  It describes a 
O(ns + min(mp, (m log(m)) + p(n-p))) time, linear space algorithm and a
O(ns + min(mp, p(n-p))) time, O(ns) space algorithm, and claims the
latter to be the fastest known such algorithm (m = size of string A, 
n = size of string B, n>= m, s = alphabet size, p = length of LCS).
I'm pretty sure implementations of the Goeman & Clausen algorithms
haven't been done in Perl yet.

-- 
Teodor Zlatanov <[EMAIL PROTECTED]>
"Brevis oratio penetrat colos, longa potatio evacuat ciphos." -Rabelais

Reply via email to