Re: [PATCH]: MEDIUM: pool little FreeBSD support improvement.
On Thu, Nov 25, 2021 at 04:29:55PM +, David CARLIER wrote: > Ok I applied your suggestions and move back the malloc_trim/mallinfo > part as it was before. Thanks, now merged! Willy
Re: [PATCH]: MEDIUM: pool little FreeBSD support improvement.
Ok I applied your suggestions and move back the malloc_trim/mallinfo part as it was before. On Thu, 25 Nov 2021 at 14:37, Willy Tarreau wrote: > > On Thu, Nov 25, 2021 at 01:19:39PM +, David CARLIER wrote: > > Here a patchset instead :) > > Thanks! > > I've reviewed it, I'm having some comments below: > > > From e8daa477b53a43ab39113cf0e9c43d9bbda1e9a9 Mon Sep 17 00:00:00 2001 > > From: David Carlier > > Date: Thu, 25 Nov 2021 10:26:50 + > > Subject: [PATCH 1/3] MINOR: pool: refactoring to make it available for other > > systems/allocators. > > > > Actually it's focusing on linux/glibc, we re trying to expand it for other > > possible systems and allocators combinations. > > --- > > include/haproxy/compat.h | 9 +++-- > > src/pool.c | 38 +++--- > > 2 files changed, 14 insertions(+), 33 deletions(-) > > > > diff --git a/include/haproxy/compat.h b/include/haproxy/compat.h > > index 25b15a1f0..b801dac08 100644 > > --- a/include/haproxy/compat.h > > +++ b/include/haproxy/compat.h > > @@ -263,9 +263,14 @@ typedef struct { } empty_t; > > #endif > > > > /* glibc 2.33 provides mallinfo2() that overcomes mallinfo()'s type > > limitations */ > > -#if (defined(__GNU_LIBRARY__) && (__GLIBC__ > 2 || __GLIBC__ == 2 && > > __GLIBC_MINOR__ >= 33)) > > +#if defined(__GNU_LIBRARY__) > > #include > > -#define HA_HAVE_MALLINFO2 > > +#if (__GLIBC__ > 2 || __GLIBC__ == 2 && __GLIBC_MINOR__ >= 33) > > +#define HA_MALLINFO_API mallinfo2 > > +#else > > +#define HA_MALLINFO_API mallinfo > > +#endif > > +#define HA_HAVE_MALLINFO > > #endif > > > > /* FreeBSD also has malloc_usable_size() but it requires malloc_np.h */ > > diff --git a/src/pool.c b/src/pool.c > > index af46b4469..16ef76cf0 100644 > > --- a/src/pool.c > > +++ b/src/pool.c > > @@ -42,14 +42,15 @@ int mem_poison_byte = -1; > > static int mem_fail_rate = 0; > > #endif > > > > -#if defined(HA_HAVE_MALLOC_TRIM) > > static int using_libc_allocator = 0; > > > > /* ask the allocator to trim memory pools */ > > static void trim_all_pools(void) > > { > > +#ifdef HAVE_MALLOC_TRIM > > if (using_libc_allocator) > > malloc_trim(0); > > +#endif > > } > > > > /* check if we're using the same allocator as the one that provides > > @@ -60,48 +61,23 @@ static void trim_all_pools(void) > > */ > > static void detect_allocator(void) > > { > > -#ifdef HA_HAVE_MALLINFO2 > > - struct mallinfo2 mi1, mi2; > > -#else > > - struct mallinfo mi1, mi2; > > -#endif > > +#if defined(HA_HAVE_MALLINFO) > > + struct HA_MALLINFO_API mi1, mi2; > > void *ptr; > > > > -#ifdef HA_HAVE_MALLINFO2 > > - mi1 = mallinfo2(); > > -#else > > - mi1 = mallinfo(); > > -#endif > > + mi1 = HA_MALLINFO_API(); > > ptr = DISGUISE(malloc(1)); > > -#ifdef HA_HAVE_MALLINFO2 > > - mi2 = mallinfo2(); > > -#else > > - mi2 = mallinfo(); > > -#endif > > + mi2 = HA_MALLINFO_API(); > > free(DISGUISE(ptr)); > > > > using_libc_allocator = !!memcmp(&mi1, &mi2, sizeof(mi1)); > > +#endif > > } > > I find it unwelcome to use a single macro for the struct name and > the function. It just turns out that they are the same but they're > for different object types, and if one day we have mallinfo3() that > works on struct mallinfo2, we're doomed. I hope it will never come to such things, sounds terrible to me :) even tough I have hard time to imagine a situation like this. And the gain in legibility > is not that great overall, I'd rather keep them with the ugly ifdefs > here. If you prefer, just write two blocks in the function, one for > mallinfo2 and one for the other. That's what I did the first time > but wasn't much convinced about the benefit, but at this point it's > a matter of taste. > > If you really want to have it this way, otherwise use two different > macros, e.g.: > #define HA_MALLINFO_TYPE struct mallinfo > #define HA_MALLINFO_FUNC(x) mallinfo(x) > > But again this adds abstraction that doesn't help much, especially in > tricky code which is written to diagnose certain things. > > > From c0f5cf6f935392e1ea69a84aae104de5ed06e68c Mon Sep 17 00:00:00 2001 > > From: David Carlier > > Date: Thu, 25 Nov 2021 13:00:54 + > > Subject: [PATCH 2/3] MEDIUM: pool : detecting jemalloc usage at runtime. > > > > Here we're just looking up if the mallctl's jemalloc API symbol is present. > > --- > > src/pool.c | 26 +++--- > > 1 file changed, 19 insertions(+), 7 deletions(-) > > > > diff --git a/src/pool.c b/src/pool.c > > index 16ef76cf0..65c037911 100644 > > --- a/src/pool.c > > +++ b/src/pool.c > > @@ -43,6 +43,7 @@ static int mem_fail_rate = 0; > > #endif > > > > static int using_libc_allocator = 0; > > +static int(*my_mallctl)(const char *, void *, size_t *, void *, size_t) = > > NULL; > > > > /* ask the allocator to trim memory pools */ > > static void trim_all_pools(void) > > @@ -61,17 +62,28 @@ static void trim_all_pools(void) > > */ >
Re: [PATCH]: MEDIUM: pool little FreeBSD support improvement.
On Thu, Nov 25, 2021 at 01:19:39PM +, David CARLIER wrote: > Here a patchset instead :) Thanks! I've reviewed it, I'm having some comments below: > From e8daa477b53a43ab39113cf0e9c43d9bbda1e9a9 Mon Sep 17 00:00:00 2001 > From: David Carlier > Date: Thu, 25 Nov 2021 10:26:50 + > Subject: [PATCH 1/3] MINOR: pool: refactoring to make it available for other > systems/allocators. > > Actually it's focusing on linux/glibc, we re trying to expand it for other > possible systems and allocators combinations. > --- > include/haproxy/compat.h | 9 +++-- > src/pool.c | 38 +++--- > 2 files changed, 14 insertions(+), 33 deletions(-) > > diff --git a/include/haproxy/compat.h b/include/haproxy/compat.h > index 25b15a1f0..b801dac08 100644 > --- a/include/haproxy/compat.h > +++ b/include/haproxy/compat.h > @@ -263,9 +263,14 @@ typedef struct { } empty_t; > #endif > > /* glibc 2.33 provides mallinfo2() that overcomes mallinfo()'s type > limitations */ > -#if (defined(__GNU_LIBRARY__) && (__GLIBC__ > 2 || __GLIBC__ == 2 && > __GLIBC_MINOR__ >= 33)) > +#if defined(__GNU_LIBRARY__) > #include > -#define HA_HAVE_MALLINFO2 > +#if (__GLIBC__ > 2 || __GLIBC__ == 2 && __GLIBC_MINOR__ >= 33) > +#define HA_MALLINFO_API mallinfo2 > +#else > +#define HA_MALLINFO_API mallinfo > +#endif > +#define HA_HAVE_MALLINFO > #endif > > /* FreeBSD also has malloc_usable_size() but it requires malloc_np.h */ > diff --git a/src/pool.c b/src/pool.c > index af46b4469..16ef76cf0 100644 > --- a/src/pool.c > +++ b/src/pool.c > @@ -42,14 +42,15 @@ int mem_poison_byte = -1; > static int mem_fail_rate = 0; > #endif > > -#if defined(HA_HAVE_MALLOC_TRIM) > static int using_libc_allocator = 0; > > /* ask the allocator to trim memory pools */ > static void trim_all_pools(void) > { > +#ifdef HAVE_MALLOC_TRIM > if (using_libc_allocator) > malloc_trim(0); > +#endif > } > > /* check if we're using the same allocator as the one that provides > @@ -60,48 +61,23 @@ static void trim_all_pools(void) > */ > static void detect_allocator(void) > { > -#ifdef HA_HAVE_MALLINFO2 > - struct mallinfo2 mi1, mi2; > -#else > - struct mallinfo mi1, mi2; > -#endif > +#if defined(HA_HAVE_MALLINFO) > + struct HA_MALLINFO_API mi1, mi2; > void *ptr; > > -#ifdef HA_HAVE_MALLINFO2 > - mi1 = mallinfo2(); > -#else > - mi1 = mallinfo(); > -#endif > + mi1 = HA_MALLINFO_API(); > ptr = DISGUISE(malloc(1)); > -#ifdef HA_HAVE_MALLINFO2 > - mi2 = mallinfo2(); > -#else > - mi2 = mallinfo(); > -#endif > + mi2 = HA_MALLINFO_API(); > free(DISGUISE(ptr)); > > using_libc_allocator = !!memcmp(&mi1, &mi2, sizeof(mi1)); > +#endif > } I find it unwelcome to use a single macro for the struct name and the function. It just turns out that they are the same but they're for different object types, and if one day we have mallinfo3() that works on struct mallinfo2, we're doomed. And the gain in legibility is not that great overall, I'd rather keep them with the ugly ifdefs here. If you prefer, just write two blocks in the function, one for mallinfo2 and one for the other. That's what I did the first time but wasn't much convinced about the benefit, but at this point it's a matter of taste. If you really want to have it this way, otherwise use two different macros, e.g.: #define HA_MALLINFO_TYPE struct mallinfo #define HA_MALLINFO_FUNC(x) mallinfo(x) But again this adds abstraction that doesn't help much, especially in tricky code which is written to diagnose certain things. > From c0f5cf6f935392e1ea69a84aae104de5ed06e68c Mon Sep 17 00:00:00 2001 > From: David Carlier > Date: Thu, 25 Nov 2021 13:00:54 + > Subject: [PATCH 2/3] MEDIUM: pool : detecting jemalloc usage at runtime. > > Here we're just looking up if the mallctl's jemalloc API symbol is present. > --- > src/pool.c | 26 +++--- > 1 file changed, 19 insertions(+), 7 deletions(-) > > diff --git a/src/pool.c b/src/pool.c > index 16ef76cf0..65c037911 100644 > --- a/src/pool.c > +++ b/src/pool.c > @@ -43,6 +43,7 @@ static int mem_fail_rate = 0; > #endif > > static int using_libc_allocator = 0; > +static int(*my_mallctl)(const char *, void *, size_t *, void *, size_t) = > NULL; > > /* ask the allocator to trim memory pools */ > static void trim_all_pools(void) > @@ -61,17 +62,28 @@ static void trim_all_pools(void) > */ > static void detect_allocator(void) > { > + extern int mallctl(const char *, void *, size_t *, void *, size_t) > __attribute__((weak)); > + > + my_mallctl = mallctl; > + > + if (!my_mallctl) > + my_mallctl = get_sym_curr_addr("mallctl"); > + > + if (!my_mallctl) { > #if defined(HA_HAVE_MALLINFO) > - struct HA_MALLINFO_API mi1, mi2; > - void *ptr; > + struct HA_MALLINFO_API mi1, mi2; > + void *ptr; > > - mi1 = HA_MALLINFO_API(); > - ptr = DISGUI
Re: [PATCH]: MEDIUM: pool little FreeBSD support improvement.
чт, 25 нояб. 2021 г. в 18:22, David CARLIER : > On Thu, 25 Nov 2021 at 12:25, Willy Tarreau wrote: > > > > On Thu, Nov 25, 2021 at 04:38:27PM +0500, ??? wrote: > > > > Thus I think that instead of focusing on the OS we ought to continue > > > > to focus on the allocator and improve runtime detection: > > > > > > > > - glibc (currently detected using detect_allocator) > > > > => use malloc_trim() > > > > - jemalloc at build time (mallctl != NULL) > > > > => use mallctl() as you did > > > > - jemalloc at runtime (mallctl == NULL but dlsym("mallctl") != > NULL) > > > > => use mallctl() as you did > > > > - others > > > > => no trimming > > > > > > > > > > I never imagined earlier that high level applications (such as reverse > > > https/tcp proxy) cares about such low level things as allocator > behaviour. > > > no jokes, really. > > > > Yes it does count a lot. That's also why we spent a lot of time > optimizing > > the pools, to limit the number of calls to the system's allocator for > > everything that uses a fixed size. I've seen some performance graphs in > > our internal ticket tracker showing the memory consumption between and > > after the switch to jemalloc, and the CPU usage as well, and sometimes > > it was very important. > > > > Glibc improved quite a bit recently (2.28 or 2.33 I don't remember) by > > implementing a per-thread cache in its ptmalloc. But in our case it's > > still not as good as jemalloc, and neither perform as well as our > > thread-local pools for fixed sizes. > > > > I'm seeing in a paper about snmalloc that it performs exceptionally well > > for small allocations. I just don't know how this degrades depending on > > the access patterns. For example some allocators are fast when you free() > > in the exact reverse allocation order, but can start to fragment or have > > more work to do finding holes if you don't free() in the exact same > order. > > > > If you re curious there is also mimalloc (with a pretty rich C api) > from Microsoft too. > > I had bad experience with tcmalloc on arm64. It turned out that it was not properly tested under arm64. actually, I think "do we really need massive alloc/free" instead of using preallocated objects. > > But that's something to keep an eye on in the future. > > > > Willy >
Re: [PATCH]: MEDIUM: pool little FreeBSD support improvement.
On Thu, 25 Nov 2021 at 12:25, Willy Tarreau wrote: > > On Thu, Nov 25, 2021 at 04:38:27PM +0500, ??? wrote: > > > Thus I think that instead of focusing on the OS we ought to continue > > > to focus on the allocator and improve runtime detection: > > > > > > - glibc (currently detected using detect_allocator) > > > => use malloc_trim() > > > - jemalloc at build time (mallctl != NULL) > > > => use mallctl() as you did > > > - jemalloc at runtime (mallctl == NULL but dlsym("mallctl") != NULL) > > > => use mallctl() as you did > > > - others > > > => no trimming > > > > > > > I never imagined earlier that high level applications (such as reverse > > https/tcp proxy) cares about such low level things as allocator behaviour. > > no jokes, really. > > Yes it does count a lot. That's also why we spent a lot of time optimizing > the pools, to limit the number of calls to the system's allocator for > everything that uses a fixed size. I've seen some performance graphs in > our internal ticket tracker showing the memory consumption between and > after the switch to jemalloc, and the CPU usage as well, and sometimes > it was very important. > > Glibc improved quite a bit recently (2.28 or 2.33 I don't remember) by > implementing a per-thread cache in its ptmalloc. But in our case it's > still not as good as jemalloc, and neither perform as well as our > thread-local pools for fixed sizes. > > I'm seeing in a paper about snmalloc that it performs exceptionally well > for small allocations. I just don't know how this degrades depending on > the access patterns. For example some allocators are fast when you free() > in the exact reverse allocation order, but can start to fragment or have > more work to do finding holes if you don't free() in the exact same order. > If you re curious there is also mimalloc (with a pretty rich C api) from Microsoft too. > But that's something to keep an eye on in the future. > > Willy
Re: [PATCH]: MEDIUM: pool little FreeBSD support improvement.
Here a patchset instead :) On Thu, 25 Nov 2021 at 12:25, Willy Tarreau wrote: > > On Thu, Nov 25, 2021 at 04:38:27PM +0500, ??? wrote: > > > Thus I think that instead of focusing on the OS we ought to continue > > > to focus on the allocator and improve runtime detection: > > > > > > - glibc (currently detected using detect_allocator) > > > => use malloc_trim() > > > - jemalloc at build time (mallctl != NULL) > > > => use mallctl() as you did > > > - jemalloc at runtime (mallctl == NULL but dlsym("mallctl") != NULL) > > > => use mallctl() as you did > > > - others > > > => no trimming > > > > > > > I never imagined earlier that high level applications (such as reverse > > https/tcp proxy) cares about such low level things as allocator behaviour. > > no jokes, really. > > Yes it does count a lot. That's also why we spent a lot of time optimizing > the pools, to limit the number of calls to the system's allocator for > everything that uses a fixed size. I've seen some performance graphs in > our internal ticket tracker showing the memory consumption between and > after the switch to jemalloc, and the CPU usage as well, and sometimes > it was very important. > > Glibc improved quite a bit recently (2.28 or 2.33 I don't remember) by > implementing a per-thread cache in its ptmalloc. But in our case it's > still not as good as jemalloc, and neither perform as well as our > thread-local pools for fixed sizes. > > I'm seeing in a paper about snmalloc that it performs exceptionally well > for small allocations. I just don't know how this degrades depending on > the access patterns. For example some allocators are fast when you free() > in the exact reverse allocation order, but can start to fragment or have > more work to do finding holes if you don't free() in the exact same order. > > But that's something to keep an eye on in the future. > > Willy From 7206aff8ce6b6cb8c9b9b36d33b4f218f33cf0d1 Mon Sep 17 00:00:00 2001 From: David Carlier Date: Thu, 25 Nov 2021 13:14:37 + Subject: [PATCH 3/3] MEDIUM: pool: using jemalloc when relevant. Finally, we trim the arenas with jemalloc if detected otherwise falling back to glibc's malloc_trim. --- src/pool.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/src/pool.c b/src/pool.c index 65c037911..3ec7a5386 100644 --- a/src/pool.c +++ b/src/pool.c @@ -48,10 +48,24 @@ static int(*my_mallctl)(const char *, void *, size_t *, void *, size_t) = NULL; /* ask the allocator to trim memory pools */ static void trim_all_pools(void) { + if (using_libc_allocator) { + if (my_mallctl) { + unsigned int i, narenas = 0; + size_t len = sizeof(narenas); + + if (my_mallctl("arenas.narenas", &narenas, &len, NULL, 0) == 0) { +for (i = 0; i < narenas; i ++) { + char mib[32] = {0}; + snprintf(mib, sizeof(mib), "arena.%u.purge", i); + (void)my_mallctl(mib, NULL, NULL, NULL, 0); +} + } + } else { #ifdef HAVE_MALLOC_TRIM - if (using_libc_allocator) - malloc_trim(0); + malloc_trim(0); #endif + } + } } /* check if we're using the same allocator as the one that provides -- 2.33.1 From e8daa477b53a43ab39113cf0e9c43d9bbda1e9a9 Mon Sep 17 00:00:00 2001 From: David Carlier Date: Thu, 25 Nov 2021 10:26:50 + Subject: [PATCH 1/3] MINOR: pool: refactoring to make it available for other systems/allocators. Actually it's focusing on linux/glibc, we re trying to expand it for other possible systems and allocators combinations. --- include/haproxy/compat.h | 9 +++-- src/pool.c | 38 +++--- 2 files changed, 14 insertions(+), 33 deletions(-) diff --git a/include/haproxy/compat.h b/include/haproxy/compat.h index 25b15a1f0..b801dac08 100644 --- a/include/haproxy/compat.h +++ b/include/haproxy/compat.h @@ -263,9 +263,14 @@ typedef struct { } empty_t; #endif /* glibc 2.33 provides mallinfo2() that overcomes mallinfo()'s type limitations */ -#if (defined(__GNU_LIBRARY__) && (__GLIBC__ > 2 || __GLIBC__ == 2 && __GLIBC_MINOR__ >= 33)) +#if defined(__GNU_LIBRARY__) #include -#define HA_HAVE_MALLINFO2 +#if (__GLIBC__ > 2 || __GLIBC__ == 2 && __GLIBC_MINOR__ >= 33) +#define HA_MALLINFO_API mallinfo2 +#else +#define HA_MALLINFO_API mallinfo +#endif +#define HA_HAVE_MALLINFO #endif /* FreeBSD also has malloc_usable_size() but it requires malloc_np.h */ diff --git a/src/pool.c b/src/pool.c index af46b4469..16ef76cf0 100644 --- a/src/pool.c +++ b/src/pool.c @@ -42,14 +42,15 @@ int mem_poison_byte = -1; static int mem_fail_rate = 0; #endif -#if defined(HA_HAVE_MALLOC_TRIM) static int using_libc_allocator = 0; /* ask the allocator to trim memory pools */ static void trim_all_pools(void) { +#ifdef HAVE_MALLOC_TRIM if (using_libc_allocator) malloc_trim(0); +#endif } /* check if we're using the same allocator as the one that provides @@ -60,48 +61,23 @@ static void trim_all_pools(void) */ static void detect_allocator(vo
Re: [PATCH]: MEDIUM: pool little FreeBSD support improvement.
On Thu, Nov 25, 2021 at 04:38:27PM +0500, ??? wrote: > > Thus I think that instead of focusing on the OS we ought to continue > > to focus on the allocator and improve runtime detection: > > > > - glibc (currently detected using detect_allocator) > > => use malloc_trim() > > - jemalloc at build time (mallctl != NULL) > > => use mallctl() as you did > > - jemalloc at runtime (mallctl == NULL but dlsym("mallctl") != NULL) > > => use mallctl() as you did > > - others > > => no trimming > > > > I never imagined earlier that high level applications (such as reverse > https/tcp proxy) cares about such low level things as allocator behaviour. > no jokes, really. Yes it does count a lot. That's also why we spent a lot of time optimizing the pools, to limit the number of calls to the system's allocator for everything that uses a fixed size. I've seen some performance graphs in our internal ticket tracker showing the memory consumption between and after the switch to jemalloc, and the CPU usage as well, and sometimes it was very important. Glibc improved quite a bit recently (2.28 or 2.33 I don't remember) by implementing a per-thread cache in its ptmalloc. But in our case it's still not as good as jemalloc, and neither perform as well as our thread-local pools for fixed sizes. I'm seeing in a paper about snmalloc that it performs exceptionally well for small allocations. I just don't know how this degrades depending on the access patterns. For example some allocators are fast when you free() in the exact reverse allocation order, but can start to fragment or have more work to do finding holes if you don't free() in the exact same order. But that's something to keep an eye on in the future. Willy
Re: [PATCH]: MEDIUM: pool little FreeBSD support improvement.
On Thu, 25 Nov 2021 at 11:38, Илья Шипицин wrote: > > > > чт, 25 нояб. 2021 г. в 14:54, Willy Tarreau : >> >> Hi David, >> >> On Wed, Nov 24, 2021 at 08:08:39PM +, David CARLIER wrote: >> > Hi >> > >> > here a little patch for FreeBSD to support memory arenas trimming. >> (...) >> > FreeBSD uses a slighty simplified version of jemalloc as libc allocator >> > since many years (there is thoughts to eventually switch to snmalloc >> > but not before a long time). >> > We detect the libc in the least hacky way in this case aiming as jemalloc >> > specific API then we try to purge arenas as much as we can. >> >> This one is interesting because according to the jemalloc doc on >> my machine it could also work on linux with jemalloc, provided that >> jemalloc is linked at build time (otherwise mallctl remains NULL). >> However it remains available uding dlopen(). >> >> Thus I think that instead of focusing on the OS we ought to continue >> to focus on the allocator and improve runtime detection: >> >> - glibc (currently detected using detect_allocator) >> => use malloc_trim() >> - jemalloc at build time (mallctl != NULL) >> => use mallctl() as you did >> - jemalloc at runtime (mallctl == NULL but dlsym("mallctl") != NULL) >> => use mallctl() as you did >> - others >> => no trimming > Fair enough I thought it was solely about system libc, but indeed the Linux/jemalloc combination is very common, interesting to look at indeed. Ok will provide the patchsets later on :-) > > I never imagined earlier that high level applications (such as reverse > https/tcp proxy) cares about such low level things as allocator behaviour. > no jokes, really. > This is how you stand out ;-) > >> >> >> That would cover the vast majority of use cases where trimming is relevant. >> I'm quite interested in jemalloc on linux, including when used at runtime >> using LD_PRELOAD, because I see it quite frequently in high-performance >> environments. When site operators see their memory usage fall two-fold and >> the CPU as well by just prepending an LD_PRELOAD in their startup script, >> they don't need to think twice and it's quickly done. >> >> Thus I think we could proceed like this in the pools init code: >> >>void detect_allocator() >>{ >> extern int mallctl(const char *, void *, size_t *, void *, size_t) >> __attribute__((weak)); >> >> my_mallctl = mallctl; >> if (!my_mallctl) >> my_mallctl = get_sym_curr_addr("mallctl"); >> >> if (!my_mallctl) { >> /* existing tests based on mallinfo/mallinfo2 */ >> } >>} >> >> At this point we'd have: >> - my_mallctl not null when using jemalloc, hence trim() would always >> use your code that purges all arenas >> >> - otherwise if using_libc_allocator is set, we can call malloc_trim() >> >> - otherwise it's an unknown allocator and we don't do anything. >> >> Thus could you please turn your patch into a small series that does >> this ? The steps I'm seeing are: >> 1. changing the detection logic so that it's always performed, and >> not just on HA_HAVE_MALLOC_TRIM. This means the variants of >> functions is_trim_enabled, trim_all_pools and detect_allocator() >> are all remerged into a single variant. trim_all_pools() will be >> the only one to rely on HA_HAVE_MALLOC_TRIM. >> >> 2. introduce mallctl availability detection and put its pointer into >> a separate one that we can get unsing dlfcn when linked in. >> >> 3. modify trim_all_pools() to use mallctl() when available, otherwise >> rely on current code with malloc_trim(). >> >> This should make a significant improvement with plug-n-play detection >> for the majority of users. >> >> Then if you want to introduce runtime detection for snmalloc to do the >> same, this would make everything much easier. >> >> Thanks, >> Willy >>
Re: [PATCH]: MEDIUM: pool little FreeBSD support improvement.
чт, 25 нояб. 2021 г. в 14:54, Willy Tarreau : > Hi David, > > On Wed, Nov 24, 2021 at 08:08:39PM +, David CARLIER wrote: > > Hi > > > > here a little patch for FreeBSD to support memory arenas trimming. > (...) > > FreeBSD uses a slighty simplified version of jemalloc as libc allocator > > since many years (there is thoughts to eventually switch to snmalloc > > but not before a long time). > > We detect the libc in the least hacky way in this case aiming as jemalloc > > specific API then we try to purge arenas as much as we can. > > This one is interesting because according to the jemalloc doc on > my machine it could also work on linux with jemalloc, provided that > jemalloc is linked at build time (otherwise mallctl remains NULL). > However it remains available uding dlopen(). > > Thus I think that instead of focusing on the OS we ought to continue > to focus on the allocator and improve runtime detection: > > - glibc (currently detected using detect_allocator) > => use malloc_trim() > - jemalloc at build time (mallctl != NULL) > => use mallctl() as you did > - jemalloc at runtime (mallctl == NULL but dlsym("mallctl") != NULL) > => use mallctl() as you did > - others > => no trimming > I never imagined earlier that high level applications (such as reverse https/tcp proxy) cares about such low level things as allocator behaviour. no jokes, really. > > That would cover the vast majority of use cases where trimming is relevant. > I'm quite interested in jemalloc on linux, including when used at runtime > using LD_PRELOAD, because I see it quite frequently in high-performance > environments. When site operators see their memory usage fall two-fold and > the CPU as well by just prepending an LD_PRELOAD in their startup script, > they don't need to think twice and it's quickly done. > > Thus I think we could proceed like this in the pools init code: > >void detect_allocator() >{ > extern int mallctl(const char *, void *, size_t *, void *, size_t) > __attribute__((weak)); > > my_mallctl = mallctl; > if (!my_mallctl) > my_mallctl = get_sym_curr_addr("mallctl"); > > if (!my_mallctl) { > /* existing tests based on mallinfo/mallinfo2 */ > } >} > > At this point we'd have: > - my_mallctl not null when using jemalloc, hence trim() would always > use your code that purges all arenas > > - otherwise if using_libc_allocator is set, we can call malloc_trim() > > - otherwise it's an unknown allocator and we don't do anything. > > Thus could you please turn your patch into a small series that does > this ? The steps I'm seeing are: > 1. changing the detection logic so that it's always performed, and > not just on HA_HAVE_MALLOC_TRIM. This means the variants of > functions is_trim_enabled, trim_all_pools and detect_allocator() > are all remerged into a single variant. trim_all_pools() will be > the only one to rely on HA_HAVE_MALLOC_TRIM. > > 2. introduce mallctl availability detection and put its pointer into > a separate one that we can get unsing dlfcn when linked in. > > 3. modify trim_all_pools() to use mallctl() when available, otherwise > rely on current code with malloc_trim(). > > This should make a significant improvement with plug-n-play detection > for the majority of users. > > Then if you want to introduce runtime detection for snmalloc to do the > same, this would make everything much easier. > > Thanks, > Willy > >
Re: [ANNOUNCE] haproxy-2.4.9
On Thu, Nov 25, 2021 at 01:29:13PM +0300, Dmitry Sivachenko wrote: > > > On 25 Nov 2021, at 13:09, Willy Tarreau wrote: > > > > Please try the two attached patches. They re-backport something that > > we earlier failed to backport that simplifies the ugly ifdefs everywhere > > that virtually break every single backport related to SSL. > > > > For me they work with/without SSL and with older versions (tested as far > > as 0.9.8). > > > > Thanks, > > Willy > > <0001-CLEANUP-servers-do-not-include-openssl-compat.patch><0002-CLEANUP-server-always-include-the-storage-for-SSL-se.patch> > > > These two patches do fix the build. OK thanks Dmitry. For now we'll probably just keep the workaround that Amaury pushed in the mean time, but I'm pretty sure that sooner or later we'll see yet another breakage, and if so it's likely that we decide to merge them. Cheers, Willy
Re: [ANNOUNCE] haproxy-2.4.9
> On 25 Nov 2021, at 13:29, Amaury Denoyelle wrote: > > Dmitry, the patches that Willy provided you should fix the issue. Now, > do you need a 2.4.10 to be emitted early with it or is it possible for > you to keep the patches in your tree so we can have a more substantial > list of change for a new version ? > As for me there is no hurry: I'll add patches to FreeBSD ports collection.
Re: [ANNOUNCE] haproxy-2.4.9
On Thu, Nov 25, 2021 at 11:42:01AM +0300, Dmitry Sivachenko wrote: > On 24 Nov 2021, at 12:57, Christopher Faulet wrote: > > > > Hi, > > > HAProxy 2.4.9 was released on 2021/11/23. It added 36 new commits > > after version 2.4.8. > > > Hello, > version 2.4.9 fails to build with OpenSSL turned off: > src/server.c:207:51: error: no member named 'ssl_ctx' in 'struct server' > if (srv->mux_proto || srv->use_ssl != 1 || !srv->ssl_ctx.alpn_str) { > ~~~ ^ > src/server.c:241:37: error: no member named 'ssl_ctx' in 'struct server' > const struct ist alpn = ist2(srv->ssl_ctx.alpn_str, > ~~~ ^ > src/server.c:242:37: error: no member named 'ssl_ctx' in 'struct server' > srv->ssl_ctx.alpn_len); > ~~~ ^ > Version 2.4.8 builds fine. > > Dmitry, the patches that Willy provided you should fix the issue. Now, do you need a 2.4.10 to be emitted early with it or is it possible for you to keep the patches in your tree so we can have a more substantial list of change for a new version ? -- Amaury Denoyelle
Re: [ANNOUNCE] haproxy-2.4.9
> On 25 Nov 2021, at 13:09, Willy Tarreau wrote: > > Please try the two attached patches. They re-backport something that > we earlier failed to backport that simplifies the ugly ifdefs everywhere > that virtually break every single backport related to SSL. > > For me they work with/without SSL and with older versions (tested as far > as 0.9.8). > > Thanks, > Willy > <0001-CLEANUP-servers-do-not-include-openssl-compat.patch><0002-CLEANUP-server-always-include-the-storage-for-SSL-se.patch> These two patches do fix the build. Thanks!
Re: [ANNOUNCE] haproxy-2.4.9
On Thu, Nov 25, 2021 at 11:02:52AM +0100, Amaury Denoyelle wrote: > On Thu, Nov 25, 2021 at 11:42:01AM +0300, Dmitry Sivachenko wrote: > > On 24 Nov 2021, at 12:57, Christopher Faulet wrote: > > > > > Hi, > > > > HAProxy 2.4.9 was released on 2021/11/23. It added 36 new commits > > > after version 2.4.8. > > > > > Hello, > > version 2.4.9 fails to build with OpenSSL turned off: > > src/server.c:207:51: error: no member named 'ssl_ctx' in 'struct server' > > if (srv->mux_proto || srv->use_ssl != 1 || !srv->ssl_ctx.alpn_str) { > > ~~~ ^ > > src/server.c:241:37: error: no member named 'ssl_ctx' in 'struct server' > > const struct ist alpn = ist2(srv->ssl_ctx.alpn_str, > > ~~~ ^ > > src/server.c:242:37: error: no member named 'ssl_ctx' in 'struct server' > > srv->ssl_ctx.alpn_len); > > ~~~ ^ > > Version 2.4.8 builds fine. > > > > > > Thanks for your report. One of my commit to handle properly websocket on > the server side introduces this issue. I'm working on a fix. Please try the two attached patches. They re-backport something that we earlier failed to backport that simplifies the ugly ifdefs everywhere that virtually break every single backport related to SSL. For me they work with/without SSL and with older versions (tested as far as 0.9.8). Thanks, Willy >From ce5ca630697a069ffbd81169663e5dbeb554179a Mon Sep 17 00:00:00 2001 From: Willy Tarreau Date: Wed, 6 Oct 2021 11:23:32 +0200 Subject: CLEANUP: servers: do not include openssl-compat This is exactly the same as for listeners, servers only include openssl-compat to provide the SSL_CTX type to use as two pointers to contexts, and to detect if NPN, ALPN, and cipher suites are supported, and save up to 5 pointers in the ssl_ctx struct if not supported. This is pointless, as these ones have all been supported for about a decade, and including this file comes with a long dependency chain that impacts lots of other files. The ctx was made a void*. Now the build time was significantly reduced, from 9.2 to 8.1 seconds, thanks to opensslconf.h being included "only" 456 times instead of 2424 previously! The total number of lines of code compiled was reduced by 15%. (cherry picked from commit 340ef2502eae2a37781e460d3590982c0e437fbd) [wt: this is backported to get rid of the painful #ifdef around SSL fields that regularly break backports] Signed-off-by: Willy Tarreau --- include/haproxy/server-t.h | 10 +- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/include/haproxy/server-t.h b/include/haproxy/server-t.h index 429195388..32b649bf3 100644 --- a/include/haproxy/server-t.h +++ b/include/haproxy/server-t.h @@ -35,9 +35,7 @@ #include #include #include -#include #include -#include #include #include #include @@ -341,7 +339,7 @@ struct server { #ifdef USE_OPENSSL char *sni_expr; /* Temporary variable to store a sample expression for SNI */ struct { - SSL_CTX *ctx; + void *ctx; struct { unsigned char *ptr; int size; @@ -353,9 +351,7 @@ struct server { __decl_thread(HA_RWLOCK_T lock); /* lock the cache and SSL_CTX during commit operations */ char *ciphers; /* cipher suite to use if non-null */ -#ifdef HAVE_SSL_CTX_SET_CIPHERSUITES char *ciphersuites; /* TLS 1.3 cipher suite to use if non-null */ -#endif int options;/* ssl options */ int verify; /* verify method (set of SSL_VERIFY_* flags) */ struct tls_version_filter methods; /* ssl methods */ @@ -363,14 +359,10 @@ struct server { char *ca_file; /* CAfile to use on verify */ char *crl_file; /* CRLfile to use on verify */ struct sample_expr *sni;/* sample expression for SNI */ -#ifdef OPENSSL_NPN_NEGOTIATED char *npn_str; /* NPN protocol string */ int npn_len;/* NPN protocol string length */ -#endif -#ifdef TLSEXT_TYPE_application_layer_protocol_negotiation char *alpn_str; /* ALPN protocol string */ int alpn_len; /* ALPN protocol string length */ -#endif } ssl_ctx; #ifdef USE_QUIC struct quic_transport_params quic_params; /* QUIC transport parameters */ -- 2.28.0 >From 6d395b766fd816cf2e7feea3286a689e635e35f9 Mon Sep 17 00:00:00 2001 From: Willy Tarreau Date: Wed, 6 Oct 2021 14:48:37 +0200 Subject: CLEANUP: server: always include the storage for SSL settings The SSL stuff in struct server takes less than 3% of it a
Re: [ANNOUNCE] haproxy-2.4.9
On Thu, Nov 25, 2021 at 11:42:01AM +0300, Dmitry Sivachenko wrote: > On 24 Nov 2021, at 12:57, Christopher Faulet wrote: > > > > Hi, > > > HAProxy 2.4.9 was released on 2021/11/23. It added 36 new commits > > after version 2.4.8. > > > Hello, > version 2.4.9 fails to build with OpenSSL turned off: > src/server.c:207:51: error: no member named 'ssl_ctx' in 'struct server' > if (srv->mux_proto || srv->use_ssl != 1 || !srv->ssl_ctx.alpn_str) { > ~~~ ^ > src/server.c:241:37: error: no member named 'ssl_ctx' in 'struct server' > const struct ist alpn = ist2(srv->ssl_ctx.alpn_str, > ~~~ ^ > src/server.c:242:37: error: no member named 'ssl_ctx' in 'struct server' > srv->ssl_ctx.alpn_len); > ~~~ ^ > Version 2.4.8 builds fine. > > Thanks for your report. One of my commit to handle properly websocket on the server side introduces this issue. I'm working on a fix. -- Amaury Denoyelle
Re: [PATCH]: MEDIUM: pool little FreeBSD support improvement.
Hi David, On Wed, Nov 24, 2021 at 08:08:39PM +, David CARLIER wrote: > Hi > > here a little patch for FreeBSD to support memory arenas trimming. (...) > FreeBSD uses a slighty simplified version of jemalloc as libc allocator > since many years (there is thoughts to eventually switch to snmalloc > but not before a long time). > We detect the libc in the least hacky way in this case aiming as jemalloc > specific API then we try to purge arenas as much as we can. This one is interesting because according to the jemalloc doc on my machine it could also work on linux with jemalloc, provided that jemalloc is linked at build time (otherwise mallctl remains NULL). However it remains available uding dlopen(). Thus I think that instead of focusing on the OS we ought to continue to focus on the allocator and improve runtime detection: - glibc (currently detected using detect_allocator) => use malloc_trim() - jemalloc at build time (mallctl != NULL) => use mallctl() as you did - jemalloc at runtime (mallctl == NULL but dlsym("mallctl") != NULL) => use mallctl() as you did - others => no trimming That would cover the vast majority of use cases where trimming is relevant. I'm quite interested in jemalloc on linux, including when used at runtime using LD_PRELOAD, because I see it quite frequently in high-performance environments. When site operators see their memory usage fall two-fold and the CPU as well by just prepending an LD_PRELOAD in their startup script, they don't need to think twice and it's quickly done. Thus I think we could proceed like this in the pools init code: void detect_allocator() { extern int mallctl(const char *, void *, size_t *, void *, size_t) __attribute__((weak)); my_mallctl = mallctl; if (!my_mallctl) my_mallctl = get_sym_curr_addr("mallctl"); if (!my_mallctl) { /* existing tests based on mallinfo/mallinfo2 */ } } At this point we'd have: - my_mallctl not null when using jemalloc, hence trim() would always use your code that purges all arenas - otherwise if using_libc_allocator is set, we can call malloc_trim() - otherwise it's an unknown allocator and we don't do anything. Thus could you please turn your patch into a small series that does this ? The steps I'm seeing are: 1. changing the detection logic so that it's always performed, and not just on HA_HAVE_MALLOC_TRIM. This means the variants of functions is_trim_enabled, trim_all_pools and detect_allocator() are all remerged into a single variant. trim_all_pools() will be the only one to rely on HA_HAVE_MALLOC_TRIM. 2. introduce mallctl availability detection and put its pointer into a separate one that we can get unsing dlfcn when linked in. 3. modify trim_all_pools() to use mallctl() when available, otherwise rely on current code with malloc_trim(). This should make a significant improvement with plug-n-play detection for the majority of users. Then if you want to introduce runtime detection for snmalloc to do the same, this would make everything much easier. Thanks, Willy
Re: [ANNOUNCE] haproxy-2.4.9
On 24 Nov 2021, at 12:57, Christopher Faulet wrote: > > > Hi, > > HAProxy 2.4.9 was released on 2021/11/23. It added 36 new commits > after version 2.4.8. > Hello, version 2.4.9 fails to build with OpenSSL turned off: src/server.c:207:51: error: no member named 'ssl_ctx' in 'struct server' if (srv->mux_proto || srv->use_ssl != 1 || !srv->ssl_ctx.alpn_str) { ~~~ ^ src/server.c:241:37: error: no member named 'ssl_ctx' in 'struct server' const struct ist alpn = ist2(srv->ssl_ctx.alpn_str, ~~~ ^ src/server.c:242:37: error: no member named 'ssl_ctx' in 'struct server' srv->ssl_ctx.alpn_len); ~~~ ^ Version 2.4.8 builds fine.
Re: [PATCH 1/1] BUG/MINOR: lua: remove loop initial declarations
Le 11/24/21 à 22:16, Bertrand Jacquin a écrit : HAProxy is documented to support gcc >= 3.4 as per INSTALL file, however hlua.c makes use of c11 only loop initial declarations leading to build failure when using gcc-4.9.4: x86_64-unknown-linux-gnu-gcc -Iinclude -Wchar-subscripts -Wcomment -Wformat -Winit-self -Wmain -Wmissing-braces -Wno-pragmas -Wparentheses -Wreturn-type -Wsequence-point -Wstrict-aliasing -Wswitch -Wtrigraphs -Wuninitialized -Wunknown-pragmas -Wunused-label -Wunused-variable -Wunused-value -Wpointer-sign -Wimplicit -pthread -fdiagnostics-color=auto -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -O3 -msse -mfpmath=sse -march=core2 -g -fPIC -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wtype-limits -DUSE_EPOLL -DUSE_NETFILTER -DUSE_PCRE2 -DUSE_PCRE2_JIT -DUSE_POLL -DUSE_THREAD -DUSE_BACKTRACE -DUSE_TPROXY -DUSE_LINUX_TPROXY -DUSE_LINUX_SPLICE -DUSE_LIBCRYPT -DUSE_CRYPT_H -DUSE_GETADDRINFO -DUSE_OPENSSL -DUSE_LUA -DUSE_ACCEPT4 -DUSE_SLZ -DUSE_CPU_AFFINITY -DUSE_TFO -DUSE_NS -DUSE_DL -DUSE_RT -DUSE_PRCTL -DUSE_THREAD_DUMP-DUSE_PCRE2 -DPCRE2_CODE_UNIT_WIDTH=8 -I/usr/local/include -DCONFIG_HAPROXY_VERSION=\"2.5.0\" -DCONFIG_HAPROXY_DATE=\"2021/11/23\" -c -o src/connection.o src/connection.c src/hlua.c: In function 'hlua_config_prepend_path': src/hlua.c:11292:2: error: 'for' loop initial declarations are only allowed in C99 or C11 mode for (size_t i = 0; i < 2; i++) { ^ src/hlua.c:11292:2: note: use option -std=c99, -std=gnu99, -std=c11 or -std=gnu11 to compile your code This commit moves loop iterator to an explicit declaration. No backport needed as this issue was introduced in v2.5-dev10~69 with commit 9e5e586e35c5 ("BUG/MINOR: lua: Fix lua error handling in `hlua_config_prepend_path()`") --- src/hlua.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/hlua.c b/src/hlua.c index 08735374af77..8dea91e75832 100644 --- a/src/hlua.c +++ b/src/hlua.c @@ -11249,6 +11249,7 @@ static int hlua_config_prepend_path(char **args, int section_type, struct proxy char *path; char *type = "path"; struct prepend_path *p = NULL; + size_t i; if (too_many_args(2, args, err, NULL)) { goto err; @@ -11289,7 +11290,7 @@ static int hlua_config_prepend_path(char **args, int section_type, struct proxy * thread. The remaining threads will be initialized based on * prepend_path_list. */ - for (size_t i = 0; i < 2; i++) { + for (i = 0; i < 2; i++) { lua_State *L = hlua_states[i]; const char *error; Thanks, merged now with backports info updated. -- Christopher Faulet