Hi,

sorry it was long, but I didn't have time to review it earlier.

On Sun, Jan 13, 2013 at 03:00:42PM +0900, Hiroaki Nakamura wrote:
> Hi there.
> 
> I saw the mail "Re: PCRE >= 8.20 JIT support in haproxy >= 1.5.x?" at
> 2011-12-27,
> and give it a try.
> 
> This is a patch for using PCRE JIT in acl.
> Could you review it?
> 
> I notice regex are used in other places, but they are more complicated to 
> modify
> to use PCRE APIs. So I focused to acl in the first try.
> 
> 
> BTW, I made a simple benchmark program for PCRE JIT beforehand.
> https://github.com/hnakamur/pcre-jit-benchmark
> 
> I read the manual for PCRE JIT
> http://www.manpagez.com/man/3/pcrejit/
> 
> and wrote my benchmark program.
> https://github.com/hnakamur/pcre-jit-benchmark/blob/master/test-pcre.c

Thanks for your patch. I'm having a few comments below :

> diff --git a/Makefile b/Makefile
> index fb6ce98..b459f06 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -150,6 +150,9 @@ ADDLIB =
>  DEFINE =
>  SILENT_DEFINE =
>  
> +ifeq ($(_LIB),)
> +_LIB         := lib
> +endif
>  
>  #### CPU dependant optimizations
>  # Some CFLAGS are set by default depending on the target CPU. Those flags 
> only
> @@ -522,7 +525,7 @@ PCREDIR           := $(shell pcre-config --prefix 
> 2>/dev/null || echo /usr/local)
>  endif
>  ifeq ($(USE_STATIC_PCRE),)
>  OPTIONS_CFLAGS  += -DUSE_PCRE $(if $(PCREDIR),-I$(PCREDIR)/include)
> -OPTIONS_LDFLAGS += $(if $(PCREDIR),-L$(PCREDIR)/lib) -lpcreposix -lpcre
> +OPTIONS_LDFLAGS += $(if $(PCREDIR),-L$(PCREDIR)/$(_LIB)) -lpcreposix -lpcre
>  endif
>  BUILD_OPTIONS   += $(call ignore_implicit,USE_PCRE)
>  endif
> @@ -534,7 +537,7 @@ ifeq ($(PCREDIR),)
>  PCREDIR         := $(shell pcre-config --prefix 2>/dev/null || echo 
> /usr/local)
>  endif
>  OPTIONS_CFLAGS  += -DUSE_PCRE $(if $(PCREDIR),-I$(PCREDIR)/include)
> -OPTIONS_LDFLAGS += $(if $(PCREDIR),-L$(PCREDIR)/lib) -Wl,-Bstatic 
> -lpcreposix -lpcre -Wl,-Bdynamic
> +OPTIONS_LDFLAGS += $(if $(PCREDIR),-L$(PCREDIR)/$(_LIB)) -Wl,-Bstatic 
> -lpcreposix -lpcre -Wl,-Bdynamic
>  BUILD_OPTIONS   += $(call ignore_implicit,USE_STATIC_PCRE)
>  endif

I have just fixed this part, I know why you had to do this, it's because of
the lib64 mess. Now we have PCRE_INC and PCRE_LIB which can be set separately,
if required at all.
 
> diff --git a/include/types/acl.h b/include/types/acl.h
> index bf5537f..15fcdb3 100644
> --- a/include/types/acl.h
> +++ b/include/types/acl.h
> @@ -213,7 +213,12 @@ struct acl_pattern {
>       union {
>               void *ptr;              /* any data */
>               char *str;              /* any string  */
> +#ifdef USE_PCRE
> +             pcre *regex;            /* a compiled regex */
> +             pcre_extra *reg_extra;  /* a study result of a compiled regex */
> +#else
>               regex_t *reg;           /* a compiled regex */
> +#endif
>       } ptr;                          /* indirect values, allocated */
>       void(*freeptrbuf)(void *ptr);   /* a destructor able to free objects 
> from the ptr */
>       int len;                        /* data length when required  */

I think you can continue to call it "reg" instead of "regex", I see
no particular reason for changing it and not the other one.

> diff --git a/src/acl.c b/src/acl.c
> index cba89ab..4461944 100644
> --- a/src/acl.c
> +++ b/src/acl.c
> @@ -900,12 +904,37 @@ acl_parse_strcat(const char **text, struct acl_pattern 
> *pattern, int *opaque, ch
>  /* Free data allocated by acl_parse_reg */
>  static void acl_free_reg(void *ptr)
>  {
> +#ifdef USE_PCRE
> +     pcre_free_study((pcre_extra *)(ptr + sizeof(pcre *)));
> +     pcre_free((pcre *)ptr);
> +#else
>       regfree((regex_t *)ptr);
> +#endif

I didn't understand the magic above. Where does the ptr+sizeof(pcre *) come
from ? Is it the way pcre internaly stores its JIT data ? Then at least I
think a comment explaining that is required. I'm a bit confused.

>  /* Parse a regex. It is allocated. */
>  int acl_parse_reg(const char **text, struct acl_pattern *pattern, int 
> *opaque, char **err)
>  {
> +#ifdef USE_PCRE
> +     pcre *preg;
> +     pcre_extra *preg_extra;
> +     int icase;
> +
> +     icase = (pattern->flags & ACL_PAT_F_IGNORE_CASE) ? PCRE_CASELESS : 0;
> +
> +     preg = pcre_compile(*text, PCRE_NO_AUTO_CAPTURE | icase, NULL, NULL, 
> NULL);
> +     if (!preg)
> +             return 0;
> +
> +        preg_extra = pcre_study(preg, PCRE_STUDY_JIT_COMPILE, NULL);

Warning, spaces instead of a tab above.

Also, I think this will not build due to #ifdef USE_PCRE, I think you
need to make it rely on the other define you found for the JIT case,
or you could as well have a USE_PCRE_JIT define to enable JIT, it might
even be easier to handle.

> +     if (!preg_extra) {
> +             pcre_free(preg_extra);
> +             return 0;
> +     }
> +
> +     pattern->ptr.regex = preg;
> +     pattern->ptr.reg_extra = preg_extra;
> +#else
>       regex_t *preg;
>       int icase;
>  
> @@ -924,6 +953,7 @@ int acl_parse_reg(const char **text, struct acl_pattern 
> *pattern, int *opaque, c
>       }
>  
>       pattern->ptr.reg = preg;
> +#endif
>       pattern->freeptrbuf = &acl_free_reg;
>       return 1;
>  }

>From what I'm seeing above, I suspect it might be easier to replace the

   regex_t *reg

with this :

   struct jit_regex *reg;

where jit_regex is defined this way :

   struct jit_regex {
        pcre *reg;
        pcre_extra *extra;
   };

Then you'll be able to write a wrapper for regcomp() and regfree() that uses
such a struct instead of a regex_t. And that could probably make it easier
to generalize JIT everywhere else in the code.

Regards,
Willy


Reply via email to