On Fri, 21 Nov 2014 19:14:23 +0100
Willy Tarreau <[email protected]> wrote:
> On Fri, Nov 21, 2014 at 06:33:46PM +0100, konrad rzentarzewski wrote:
> > On Fri, Nov 21, 2014 at 06:23:43PM +0100, Willy Tarreau wrote:
> > > There were changes to the acl part but in theory this should not
> > > impact performance, especially not like this. Are you sure you
> > > compiled 1.5 the same way as you did for 1.4 ? Please run
> > > "haproxy -vv" for both versions and output the result. It's possible
> > > that you used libpcre for 1.4 and not for 1.5 for example. PCRE is
> > > extremely fast in general and not using it could easily cause what
> > > you're observing.
> >
> > yes, both use PCRE, difference is drammatic - 1.4 uses 25-30% CPU with
> > nbproc 1 and 1.5 maxes out all cores with nbproc 8.
>
> That's a big change. We can see huge variations between regex libs. I
> remember playing at exploiting some regex flaws in the past and managing
> to get some very simple regex literally take *days* to execute. But if
> you're running with the same lib, there certainly is something odd. I
> could imagine a loss of a few percent due to some code changes, but not
> something that big. Are you certain the same libpcre is used in both
> cases ? Running "ldd" on haproxy will tell you if it's dynamically
> linked or not. For example if version 6.6 comes from a static lib and
> is only used for 1.5 it can make a difference. Would you be interested
> in me providing you with a static binary to test ?
>
> > this pinpoints to this exact acl i've included - if i hash it out
> > everything works as normal (same load as 1.4).
>
> OK, I'll ask Thierry to take a look at it, as he knows this part better
> than anyone else. At least your rules don't show anything complex! Just
> out of curiosity, how many frontends do you have which make use of these
> files ? I'm just realizing that if we have a bug causing all instances
> of the same file to be concatenated for all frontends, it could result
> in a much larger regex list and may explain things a bit.
>
Hello, I found a bug during the load of the pattern files.
Unfortunately a control is odd, and the patterns are catenated.
A quick fix for production is to reload the pattern files via the
HAProxy's control socket.
prompt
clear acl <filename>
for each pattern entry:
add acl <filename>
Otherwise, I join a fix. Can you try it ?
Thierry
> > $ haproxy -vv
> > HA-Proxy version 1.4.25 2014/03/27
> > Copyright 2000-2014 Willy Tarreau <[email protected]>
> >
> > Build options :
> > TARGET = linux26
> > CPU = generic
> > CC = gcc
> > CFLAGS = -O2 -g -fno-strict-aliasing
> > OPTIONS = USE_REGPARM=1 USE_PCRE=1
> (...)
> OK quite simple.
>
> > $ haproxy -vv
> > HA-Proxy version 1.5.8 2014/10/31
> > Copyright 2000-2014 Willy Tarreau <[email protected]>
> >
> > Build options :
> > TARGET = linux26
> > CPU = generic
> > CC = gcc
> > CFLAGS = -O2 -g -fno-strict-aliasing
> > OPTIONS = USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_PCRE=1
>
> Here as well.
>
> > Default settings :
> > maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200
> >
> > Encrypted password support via crypt(3): yes
> > Built with zlib version : 1.2.3
> > Compression algorithms supported : identity, deflate, gzip
> > Built with OpenSSL version : OpenSSL 1.0.1j-fips 15 Oct 2014
> > Running on OpenSSL version : OpenSSL 1.0.1j-fips 15 Oct 2014
> > OpenSSL library supports TLS extensions : yes
> > OpenSSL library supports SNI : yes
> > OpenSSL library supports prefer-server-ciphers : yes
> > Built with PCRE version : 6.6 06-Feb-2006
>
> Ah you beat me on this one, my oldest one is 7.7 now :-)
>
> Best regards,
> Willy
>
>
>From 196fc042300feba41f5b3ca7dbd2d8a8113c1141 Mon Sep 17 00:00:00 2001
From: Thierry FOURNIER <[email protected]>
Date: Mon, 24 Nov 2014 11:14:42 +0100
Subject: [PATCH] BUG/MEDIUM: pattern: don't load more than once a pattern
list.
A memory optimization can use the same pattern expression for many
equal pattern list (same parse method, index method and index_smp
method).
The pattern expression is returned by "pattern_new_expr", but this
function dont indicate if the returned pattern is already in use.
So, the caller function reload the list of patterns in addition with
the existing patterns. This behavior is not a problem with tree indexed
pattern, but it grows the lists indexed patterns.
This fix add a "reuse" flag in return of the function "pattern_new_expr".
If the flag is set, I suppose that the patterns are already loaded.
---
include/proto/pattern.h | 3 ++-
src/acl.c | 2 +-
src/pattern.c | 22 ++++++++++++++++++++--
3 files changed, 23 insertions(+), 4 deletions(-)
diff --git a/include/proto/pattern.h b/include/proto/pattern.h
index 4a969ac..7855474 100644
--- a/include/proto/pattern.h
+++ b/include/proto/pattern.h
@@ -206,7 +206,8 @@ int pattern_read_from_file(struct pattern_head *head, unsigned int refflags, con
*/
void pattern_init_expr(struct pattern_expr *expr);
struct pattern_expr *pattern_lookup_expr(struct pattern_head *head, struct pat_ref *ref);
-struct pattern_expr *pattern_new_expr(struct pattern_head *head, struct pat_ref *ref, char **err);
+struct pattern_expr *pattern_new_expr(struct pattern_head *head, struct pat_ref *ref,
+ char **err, int *reuse);
struct sample_storage **pattern_find_smp(struct pattern_expr *expr, struct pat_ref_elt *elt);
int pattern_delete(struct pattern_expr *expr, struct pat_ref_elt *ref);
diff --git a/src/acl.c b/src/acl.c
index 8f3fd9e..d8b3000 100644
--- a/src/acl.c
+++ b/src/acl.c
@@ -532,7 +532,7 @@ struct acl_expr *parse_acl_expr(const char **args, char **err, struct arg_list *
}
/* Create new pattern expression associated to this reference. */
- pattern_expr = pattern_new_expr(&expr->pat, ref, err);
+ pattern_expr = pattern_new_expr(&expr->pat, ref, err, NULL);
if (!pattern_expr)
goto out_free_expr;
diff --git a/src/pattern.c b/src/pattern.c
index c63365d..20547f9 100644
--- a/src/pattern.c
+++ b/src/pattern.c
@@ -1855,12 +1855,19 @@ struct pattern_expr *pattern_lookup_expr(struct pattern_head *head, struct pat_r
* <ref> can be NULL. If an error is occured, the function returns NULL and
* <err> is filled. Otherwise, the function returns new pattern_expr linked
* with <head> and <ref>.
+ *
+ * The returned value can be a alredy filled pattern list, in this case the
+ * flag <reuse> is set.
*/
-struct pattern_expr *pattern_new_expr(struct pattern_head *head, struct pat_ref *ref, char **err)
+struct pattern_expr *pattern_new_expr(struct pattern_head *head, struct pat_ref *ref,
+ char **err, int *reuse)
{
struct pattern_expr *expr;
struct pattern_expr_list *list;
+ if (reuse)
+ *reuse = 0;
+
/* Memory and initialization of the chain element. */
list = malloc(sizeof(*list));
if (!list) {
@@ -1915,6 +1922,8 @@ struct pattern_expr *pattern_new_expr(struct pattern_head *head, struct pat_ref
* with ref and we must not free it.
*/
list->do_free = 0;
+ if (reuse)
+ *reuse = 1;
}
/* The new list element reference the pattern_expr. */
@@ -2087,6 +2096,7 @@ int pattern_read_from_file(struct pattern_head *head, unsigned int refflags,
struct pat_ref *ref;
struct pattern_expr *expr;
struct pat_ref_elt *elt;
+ int reuse;
/* Lookup for the existing reference. */
ref = pat_ref_lookup(filename);
@@ -2161,12 +2171,20 @@ int pattern_read_from_file(struct pattern_head *head, unsigned int refflags,
*/
expr = pattern_lookup_expr(head, ref);
if (!expr || (expr->mflags != patflags)) {
- expr = pattern_new_expr(head, ref, err);
+ expr = pattern_new_expr(head, ref, err, &reuse);
if (!expr)
return 0;
expr->mflags = patflags;
}
+ /* The returned expression may be not empty, because the function
+ * "pattern_new_expr" lookup for similar pattern list and can
+ * reuse a already filled pattern list. In this case, we can not
+ * reload the patterns.
+ */
+ if (reuse)
+ return 1;
+
/* Load reference content in the pattern expression. */
list_for_each_entry(elt, &ref->head, list) {
if (!pat_ref_push(elt, expr, patflags, err)) {
--
1.9.4