Re: [PATCH]: regex: PCRE2 optimisation with JIT

2020-08-13 Thread Willy TARREAU
On Thu, Aug 13, 2020 at 06:17:15PM +0100, David CARLIER wrote:
> To be honest, not a huge improvement, but a relatively constant one, as
> it is set once per regex compilation.

Perfect, thanks, now merged.

Willy



Re: [PATCH]: regex: PCRE2 optimisation with JIT

2020-08-13 Thread David CARLIER
To be honest, not a huge improvement, but a relatively constant one, as
it is set once per regex compilation.

On Thu, 13 Aug 2020 at 17:54, Willy TARREAU wrote:
>
> On Thu, Aug 13, 2020 at 05:30:49PM +0100, David CARLIER wrote:
> > In fact the JIT match does fewer checks than the normal match and fits
> > better when the regex code has been compiled with JIT. However, the
> > classic match call still works with JIT; it just does more checks.
>
> OK! Did you observe any gain in doing so? Because we're replacing a
> direct call with an indirect one. I know it's a detail, but if we're
> just replacing a few "if" inside pcre2_match() with an indirect call,
> it might easily be a tie!
>
> Willy



Re: [PATCH]: regex: PCRE2 optimisation with JIT

2020-08-13 Thread Willy TARREAU
On Thu, Aug 13, 2020 at 05:30:49PM +0100, David CARLIER wrote:
> In fact the JIT match does fewer checks than the normal match and fits
> better when the regex code has been compiled with JIT. However, the
> classic match call still works with JIT; it just does more checks.

OK! Did you observe any gain in doing so? Because we're replacing a
direct call with an indirect one. I know it's a detail, but if we're
just replacing a few "if" inside pcre2_match() with an indirect call,
it might easily be a tie!
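
One rough way to check the direct-versus-indirect question is a small timing
loop. The sketch below is only an illustration: count_char(), time_direct()
and time_indirect() are hypothetical names, the workload is a stand-in for a
real match call, and an optimising compiler may inline the direct loop, so any
numbers should be read with care.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical workload standing in for a match call: counts occurrences of c. */
static int count_char(const char *s, size_t len, char c)
{
	int n = 0;

	for (size_t i = 0; i < len; i++)
		n += (s[i] == c);
	return n;
}

typedef int (*work_fn)(const char *, size_t, char);

static const char sample[] = "a=b&c=d";

/* Calls the workload directly; returns elapsed CPU seconds, total in *sum. */
static double time_direct(long iters, long *sum)
{
	assert(iters >= 0);
	clock_t start = clock();
	long total = 0;

	for (long i = 0; i < iters; i++)
		total += count_char(sample, sizeof(sample) - 1, '=');
	*sum = total;
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Same loop, but through a volatile pointer so the call stays indirect. */
static double time_indirect(long iters, long *sum)
{
	assert(iters >= 0);
	work_fn volatile fn = count_char;
	clock_t start = clock();
	long total = 0;

	for (long i = 0; i < iters; i++)
		total += fn(sample, sizeof(sample) - 1, '=');
	*sum = total;
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling both functions with the same iteration count and comparing the
returned times gives a first impression; the sums are returned so the caller
can confirm both paths did the same work.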

Willy



Re: [PATCH]: regex: PCRE2 optimisation with JIT

2020-08-13 Thread David CARLIER
In fact the JIT match does fewer checks than the normal match and fits
better when the regex code has been compiled with JIT. However, the
classic match call still works with JIT; it just does more checks.
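
The idea of picking the match function once, at compile time of the regex,
can be sketched without PCRE2 at all. Everything below is a hypothetical
stand-in (toy_regex, checked_match, fast_match and the prefix-match logic are
invented for illustration); it only shows that two functions with the same
signature can be swapped behind one pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shared signature: this is what allows one pointer to hold either function. */
typedef int (*match_fn)(const char *pattern, const char *subject, size_t len);

/* "Classic" entry point: validates its arguments on every call. */
static int checked_match(const char *pattern, const char *subject, size_t len)
{
	size_t plen;

	if (!pattern || !subject)
		return -1;
	plen = strlen(pattern);
	return len >= plen && memcmp(subject, pattern, plen) == 0;
}

/* "Fast path" entry point: trusts the caller and skips the NULL checks. */
static int fast_match(const char *pattern, const char *subject, size_t len)
{
	size_t plen = strlen(pattern);

	return len >= plen && memcmp(subject, pattern, plen) == 0;
}

struct toy_regex {
	const char *pattern;
	match_fn mfn; /* chosen once, used for every later match */
};

/* jit_ok is a hypothetical flag meaning "the fast path is available". */
static void toy_regex_init(struct toy_regex *re, const char *pattern, int jit_ok)
{
	assert(re && pattern);
	re->pattern = pattern;
	re->mfn = jit_ok ? fast_match : checked_match;
}

/* Returns 1 if subject starts with the pattern, 0 otherwise. */
static int toy_regex_exec(const struct toy_regex *re, const char *subject)
{
	return re->mfn(re->pattern, subject, strlen(subject));
}
```

The call site in toy_regex_exec() never tests which variant is in use; the
decision is made once in toy_regex_init().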

Regards.

On Thu, 13 Aug 2020 at 17:22, Willy TARREAU wrote:
>
> Hi David,
>
> On Thu, Aug 13, 2020 at 03:00:28PM +0100, David CARLIER wrote:
> > Subject: [PATCH] CLEANUP/MEDIUM: regex: PCRE2 use JIT match when JIT
> >  optimisation occurred.
> >
> > When a regex has been successfully compiled by the JIT pass, it is better
> >  to use the related match, thankfully having the same signature, for better
> >  performance.
>
> I'm not sure I understand correctly. Does this mean that right now when we
> use PCRE/JIT we compile using JIT but we don't match using it? This
> sounds strange to me because I remember that you published numbers a
> long time ago showing a real boost with JIT. Or is there only a special
> case where JIT is used for the match? Or only a special case where it
> is not used?
>
> Thanks!
> Willy



Re: [PATCH]: regex: PCRE2 optimisation with JIT

2020-08-13 Thread Willy TARREAU
Hi David,

On Thu, Aug 13, 2020 at 03:00:28PM +0100, David CARLIER wrote:
> Subject: [PATCH] CLEANUP/MEDIUM: regex: PCRE2 use JIT match when JIT
>  optimisation occurred.
> 
> When a regex has been successfully compiled by the JIT pass, it is better
>  to use the related match, thankfully having the same signature, for better
>  performance.

I'm not sure I understand correctly. Does this mean that right now when we
use PCRE/JIT we compile using JIT but we don't match using it? This
sounds strange to me because I remember that you published numbers a
long time ago showing a real boost with JIT. Or is there only a special
case where JIT is used for the match? Or only a special case where it
is not used?

Thanks!
Willy



[PATCH]: regex: PCRE2 optimisation with JIT

2020-08-13 Thread David CARLIER
Hi, here is a small update proposal for the PCRE2 support.

Hope it's useful.

Cheers.
From f52735b133fff5c195a52a54623b556ecb58e22d Mon Sep 17 00:00:00 2001
From: David Carlier 
Date: Thu, 13 Aug 2020 14:53:41 +0100
Subject: [PATCH] CLEANUP/MEDIUM: regex: PCRE2 use JIT match when JIT
 optimisation occurred.

When a regex has been successfully compiled by the JIT pass, it is better
 to use the related match, thankfully having the same signature, for better
 performance.

Signed-off-by: David Carlier 
---
 include/haproxy/regex-t.h | 1 +
 include/haproxy/regex.h   | 4 ++--
 src/regex.c               | 9 ++++++---
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/haproxy/regex-t.h b/include/haproxy/regex-t.h
index f26599430..ff415e8e1 100644
--- a/include/haproxy/regex-t.h
+++ b/include/haproxy/regex-t.h
@@ -54,6 +54,7 @@ struct my_regex {
 #endif
 #endif
 #elif USE_PCRE2
+	int(*mfn)(const pcre2_code *, PCRE2_SPTR, PCRE2_SIZE, PCRE2_SIZE, uint32_t, pcre2_match_data *, pcre2_match_context *);
 	pcre2_code *reg;
 #else /* no PCRE */
 	regex_t regex;
diff --git a/include/haproxy/regex.h b/include/haproxy/regex.h
index e093051a6..2cd9573e9 100644
--- a/include/haproxy/regex.h
+++ b/include/haproxy/regex.h
@@ -62,7 +62,7 @@ static inline int regex_exec(const struct my_regex *preg, char *subject)
 	int ret;
 
 	pm = pcre2_match_data_create_from_pattern(preg->reg, NULL);
-	ret = pcre2_match(preg->reg, (PCRE2_SPTR)subject, (PCRE2_SIZE)strlen(subject),
+	ret = preg->mfn(preg->reg, (PCRE2_SPTR)subject, (PCRE2_SIZE)strlen(subject),
 		0, 0, pm, NULL);
 	pcre2_match_data_free(pm);
 	if (ret < 0)
@@ -94,7 +94,7 @@ static inline int regex_exec2(const struct my_regex *preg, char *subject, int le
 	int ret;
 
 	pm = pcre2_match_data_create_from_pattern(preg->reg, NULL);
-	ret = pcre2_match(preg->reg, (PCRE2_SPTR)subject, (PCRE2_SIZE)length,
+	ret = preg->mfn(preg->reg, (PCRE2_SPTR)subject, (PCRE2_SIZE)length,
 		0, 0, pm, NULL);
 	pcre2_match_data_free(pm);
 	if (ret < 0)
diff --git a/src/regex.c b/src/regex.c
index 95da30353..45a7e9004 100644
--- a/src/regex.c
+++ b/src/regex.c
@@ -170,7 +170,7 @@ int regex_exec_match(const struct my_regex *preg, const char *subject,
 	 */
 #ifdef USE_PCRE2
 	pm = pcre2_match_data_create_from_pattern(preg->reg, NULL);
-	ret = pcre2_match(preg->reg, (PCRE2_SPTR)subject, (PCRE2_SIZE)strlen(subject), 0, options, pm, NULL);
+	ret = preg->mfn(preg->reg, (PCRE2_SPTR)subject, (PCRE2_SIZE)strlen(subject), 0, options, pm, NULL);
 
 	if (ret < 0) {
 		pcre2_match_data_free(pm);
@@ -252,7 +252,7 @@ int regex_exec_match2(const struct my_regex *preg, char *subject, int length,
 		options |= PCRE_NOTBOL;
 #endif
 
-	/* The value returned by pcre_exec()/pcre2_match() is one more than the highest numbered
+	/* The value returned by pcre_exec()/pcre2_(jit_)match() is one more than the highest numbered
 	 * pair that has been set. For example, if two substrings have been captured,
 	 * the returned value is 3. If there are no capturing subpatterns, the return
 	 * value from a successful match is 1, indicating that just the first pair of
@@ -263,7 +263,7 @@ int regex_exec_match2(const struct my_regex *preg, char *subject, int length,
 	 */
 #ifdef USE_PCRE2
 	pm = pcre2_match_data_create_from_pattern(preg->reg, NULL);
-	ret = pcre2_match(preg->reg, (PCRE2_SPTR)subject, (PCRE2_SIZE)length, 0, options, pm, NULL);
+	ret = preg->mfn(preg->reg, (PCRE2_SPTR)subject, (PCRE2_SIZE)length, 0, options, pm, NULL);
 
 	if (ret < 0) {
 		pcre2_match_data_free(pm);
@@ -365,6 +365,7 @@ struct my_regex *regex_comp(const char *str, int cs, int cap, char **err)
 		goto out_fail_alloc;
 	}
 
+	regex->mfn = &pcre2_match;
 #if defined(USE_PCRE2_JIT)
 	jit = pcre2_jit_compile(regex->reg, PCRE2_JIT_COMPLETE);
 	/*
@@ -375,6 +376,8 @@ struct my_regex *regex_comp(const char *str, int cs, int cap, char **err)
 		pcre2_code_free(regex->reg);
 		memprintf(err, "regex '%s' jit compilation failed", str);
 		goto out_fail_alloc;
+	} else {
+		regex->mfn = &pcre2_jit_match;
 	}
 #endif
 
-- 
2.28.0



Re: Haproxy 1.8.26-1~bpo9+1

2020-08-13 Thread Erwin Schliske
> The patch was for the 2.0. It must be adapted for the 1.8. But, it is not
> necessary because the bug is now fixed in 2.0 and 1.8 :
>
>   * 2.0 : http://git.haproxy.org/?p=haproxy-2.0.git;a=commit;h=307f31ec
>   * 1.8 : http://git.haproxy.org/?p=haproxy-1.8.git;a=commit;h=179d316c


Thanks. I need the patch until 1.8.27 is released. I found it yesterday and
tested it successfully. The observed problem is now solved.

Best regards,
Erwin Schliske