Re: [PATCH] Boyer-Moore extension for textsearch infrastructure

Thomas Graf Sat, 20 Aug 2005 07:58:10 -0700

Pablo,

* Pablo Neira <[EMAIL PROTECTED]> 2005-08-20 16:35
> Attached the implementation of the Boyer-Moore[1] string search 
> algorithm for brand new textsearch infrastructure.


Are you aware that this implementation is only correct under
the assumption that a pattern is never spread over mutliple
blocks? You'll have to add a buffer of patlen size and
continously fill it with data from get_next_block() to make
sure it works for non-linear data.

I'm not saying you have to add this but it must be documented
somewhere. This algorithm is still very useful if comparisons
of non-linear data is not a requirement.

The additional badshift jumps come at the cost that bm requires
random access over a range of patlen bytes while kmp is fine
with sequential access.

> @Thomas: I did some little benchmarks[2] to compare Knuth-Pratt-Morris 
> vs Boyer-Moore. I've used the iptables string match to add rules for the 
> packet classification.

Thank you, it would be interesting to see how they compare once
you've either added the above mentioned buffer or use a naive
or kmp match at the borders of each block.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Boyer-Moore extension for textsearch infrastructure

Reply via email to