Edi Weitz <[EMAIL PROTECTED]> writes:
> I was aware of Boyer-Moore although I'm not really familiar with
> it. I'll try to implement it this weekend and let the list know
> whether there's a significant difference. (I guess you first
> have to check the size of the strings in order to decide if it
> is worth building the translation tables.)
Below is my implementation of Boyer-Moore-Horspool algorithm. It
should be better than Boyer-Moore in practise.
If you really need speed, there is algorithm called
Boyer-Moore-Horspool-Hume-Sunday. It is 20% better than
Boyer-Moore-Horspool in practice.
;;;
;;; Boyer-Moore-Horspool
;;;
;;; m - pattern size
;;; n - text size
;;;
;;; Worst case time: O(mn)
;;; Average time: O(n/m)
;;; In practice better than Boyer-Moore and Knuth-Morris-Pratt
;;; (when number of characters & m are not wery small)
;;;
(defun bmh-search (pattern text)
(declare (optimize (speed 3) (safety 0) (debug 0)
(compilation-speed 0) (space 0))
(type simple-base-string pattern text))
(check-type pattern simple-base-string)
(check-type text simple-base-string)
(let* ((n (length text))
(m (length pattern))
(skip (make-array char-code-limit
:element-type 'fixnum
:initial-element m)))
(declare (fixnum n m))
(when (= 0 m)
(return-from bmh-search 0))
(loop for k fixnum below m do
(setf (aref skip (char-code (aref pattern k))) (- m k 1)))
(loop for k fixnum = (1- m) then
(setf k (+ k (max 1 (aref skip (char-code (aref text k))))))
while (< k n)
do
(loop for j fixnum downfrom (1- m)
for i fixnum downfrom k
while (and (<= 0 j)
(char= (aref text i))
(aref pattern j))
finally (if (= j -1)
(return-from bmh-search (1+ i)))))))
;;;; eof
Kimmo
--
((lambda (integer) ;; http://www.iki.fi/kt/
(coerce (loop for i upfrom 0 by 8 below (integer-length integer)
collect (code-char (ldb (byte 8 i) integer))) 'string))
100291759904362517251920937783274743691485481194069255743433035)