On 6/7/07, Jukka Zitting <[EMAIL PROTECTED]> wrote:
Hi,

On 6/7/07, Andrew C. Oliver <[EMAIL PROTECTED]> wrote:
> although you can't use it (due to Apache's anti-LGPL dogma)
> http://blog.buni.org/blog/mbarker/Meldware/2007/06/04/Panto-0-4-release-Still-really-fast
>
> I suggest looking at the technique used by Buni's panto.

Thanks for the tip! I actually considered using a similar approach but
with quick testing it seems like the benefit of skipping bytes in the
Boyer-Moore algorithm is not too big for typical MIME boundaries that
are something like 20-40 bytes long. I guess the cache lines of
typical processors are already that big, so fetching just a single
byte within the boundary range is roughly equivalent to fetching all
the bytes especially if you have slow RAM.
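
For readers unfamiliar with the byte skipping discussed above, here is a minimal sketch of the Horspool simplification of Boyer-Moore applied to a boundary search. It is an illustration only, not code from Panto or from any mail library; the function name `horspool_find` and the sample boundary are made up for this example.

```python
def horspool_find(data: bytes, boundary: bytes) -> int:
    """Return the index of the first occurrence of boundary in data, or -1."""
    m = len(boundary)
    if m == 0:
        return 0
    # Shift table: for every byte value, how far the window may jump when that
    # byte is aligned with the last position of the boundary.
    shift = [m] * 256
    for i in range(m - 1):
        shift[boundary[i]] = m - 1 - i
    pos = 0
    while pos + m <= len(data):
        if data[pos:pos + m] == boundary:
            return pos
        # Only the byte under the boundary's last position decides the jump;
        # this is the "skipping" whose benefit is limited for short boundaries.
        pos += shift[data[pos + m - 1]]
    return -1


# Hypothetical sample message and boundary, for illustration only.
sample = b"preamble\r\n--frontier42\r\nbody text\r\n--frontier42--\r\n"
print(horspool_find(sample, b"\r\n--frontier42"))
```

With a boundary of 20-40 bytes the largest possible jump is 20-40 bytes, which is the same order as a typical cache line, so the skipped bytes have usually been fetched from memory anyway.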

I'm currently experimenting with an algorithm that does a sequential
scan of the data, but instead of doing it one byte at a time the
algorithm tries to match 4 or 8 byte sequences depending on how long
the boundary string is.
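
The message does not spell out the algorithm, so the following is only one possible reading of it: scan sequentially, keep the last 4 or 8 bytes in a rolling integer, and compare that integer with the first 4 or 8 bytes of the boundary before verifying the rest. The name `word_scan_find`, the width rule (8 bytes for boundaries of 8 bytes or more, otherwise 4) and the fallback for very short boundaries are assumptions made for this sketch.

```python
def word_scan_find(data: bytes, boundary: bytes) -> int:
    """Return the index of the first occurrence of boundary in data, or -1."""
    m = len(boundary)
    if m < 4:
        # Assumption: boundaries too short for a 4-byte word use a plain search.
        return data.find(boundary)
    width = 8 if m >= 8 else 4          # assumed rule for choosing the word size
    mask = (1 << (8 * width)) - 1
    target = int.from_bytes(boundary[:width], "big")
    window = 0
    for i, byte in enumerate(data):
        # Rolling word holding the most recent `width` bytes of the input.
        window = ((window << 8) | byte) & mask
        if i >= width - 1 and window == target:
            start = i - width + 1
            # The word matched the boundary prefix; verify the full boundary.
            if data[start:start + m] == boundary:
                return start
    return -1


# Hypothetical sample message and boundary, for illustration only.
sample = b"header\r\n\r\n--abc\r\npart one\r\n--abc--\r\n"
print(word_scan_find(sample, b"\r\n--abc"))
```

Every input byte is still touched once, but the comparison against the boundary happens one word at a time rather than one byte at a time. In a language with native 32-bit and 64-bit integers the word comparison is a single machine instruction, which is where the approach would be expected to pay off.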

Perhaps multiple algorithms would be useful. It is often possible to
know or guesstimate the size of the message.

- robert
