On Wed, Sep 05, 2001 at 07:11:37PM -0700, Justin Erenkrantz wrote:
> On Wed, Sep 05, 2001 at 06:46:45PM -0700, Brian Pane wrote:
> > Justin Erenkrantz wrote:
> > [...]
> >
> > >+/* Implements the BNDM search algorithm (as described above).
> > >+ *
> > >+ * n - the pattern to search for
> > >+ * nl - length of the pattern to search for
> > >+ * h - the string to look in
> > >+ * hl - length of the string to look for
> > >+ * t - precompiled bndm structure against the pattern
> > >+ *
> > >+ * Returns the count of character that is the first match or hl if no
> > >+ * match is found.
> > >+ */
> > >+static int bndm(const char *n, apr_size_t nl, const char *h, apr_size_t hl,
> > >+ bndm_t *t)
> > >+{
> > >+ apr_size_t skip;
> > >+ const char *he, *p, *pi;
> > >+ unsigned int *T, x, d;
> > >+
> > >+ he = h + hl;
> > >+
> > >+ T = t->T;
> > >+ x = t->x << 1;
> > >+
> > >+ pi = h - 1; /* pi: p initial */
> > >+ p = pi + nl; /* compare window right to left. point to the first char */
> > >+
> > >+ do {
> > >+ skip = nl;
> > >+ d = x;
> > >+ do {
> > >+ d = (d >> 1) & T[(unsigned char) *p--];
> > >+ if ((d & 1)) {
> > >+ if (p != pi)
> > >+ skip = p - pi;
> > >+ else
> > >+ return p - h + 1;
> > >+ }
> > >+ } while (d > 1);
> > >+ p = (pi += skip) + nl;
> > >+ } while (p < he);
> > >
> > If nl > hl (e.g., if the caller tries to search for a 5-byte pattern in
> > a 3-byte string), the first loop iteration will look at memory beyond
> > the end of the string. Should this be a while loop instead of do-while?
> > (Or is the caller responsible for avoiding this case?)
>
> Good point. I'll make it a while loop instead. If we start out past
> the end of h, we know we don't have a match. For mod_include, the
> edge cases should pick up if that is part of a tag spanning buckets.
Actually, I think the conditional should be:
while (p <= he)
Thoughts? We're scanning R->L, so p points to the end of the string.
It is possible to have "<!--#" as n (which should match). -- justin