Hi David,

On Wed, Nov 24, 2021 at 08:08:39PM +0000, David CARLIER wrote:
> Hi
> 
> here a little patch for FreeBSD to support memory arenas trimming.
(...)
> FreeBSD uses a slighty simplified version of jemalloc as libc allocator
> since many years (there is thoughts to eventually switch to snmalloc
>  but not before a long time).
> We detect the libc in the least hacky way in this case aiming as jemalloc
>  specific API then we try to purge arenas as much as we can.

This one is interesting because according to the jemalloc doc on
my machine it could also work on linux with jemalloc, provided that
jemalloc is linked at build time (otherwise mallctl remains NULL).
However it remains available uding dlopen().

Thus I think that instead of focusing on the OS we ought to continue
to focus on the allocator and improve runtime detection:

  - glibc (currently detected using detect_allocator)
    => use malloc_trim()
  - jemalloc at build time (mallctl != NULL)
    => use mallctl() as you did
  - jemalloc at runtime (mallctl == NULL but dlsym("mallctl") != NULL)
    => use mallctl() as you did
  - others
    => no trimming

That would cover the vast majority of use cases where trimming is relevant.
I'm quite interested in jemalloc on linux, including when used at runtime
using LD_PRELOAD, because I see it quite frequently in high-performance
environments. When site operators see their memory usage fall two-fold and
the CPU as well by just prepending an LD_PRELOAD in their startup script,
they don't need to think twice and it's quickly done.

Thus I think we could proceed like this in the pools init code:

   void detect_allocator()
   {
        extern int mallctl(const char *, void *, size_t *, void *, size_t) 
__attribute__((weak));

        my_mallctl = mallctl;
        if (!my_mallctl)
                my_mallctl = get_sym_curr_addr("mallctl");

        if (!my_mallctl) {
                /* existing tests based on mallinfo/mallinfo2 */
        }
   }

At this point we'd have:
  - my_mallctl not null when using jemalloc, hence trim() would always
    use your code that purges all arenas

  - otherwise if using_libc_allocator is set, we can call malloc_trim()

  - otherwise it's an unknown allocator and we don't do anything.

Thus could you please turn your patch into a small series that does
this ? The steps I'm seeing are:
  1. changing the detection logic so that it's always performed, and
     not just on HA_HAVE_MALLOC_TRIM. This means the variants of
     functions is_trim_enabled, trim_all_pools and detect_allocator()
     are all remerged into a single variant. trim_all_pools() will be
     the only one to rely on HA_HAVE_MALLOC_TRIM.

  2. introduce mallctl availability detection and put its pointer into
     a separate one that we can get unsing dlfcn when linked in.

  3. modify trim_all_pools() to use mallctl() when available, otherwise
     rely on current code with malloc_trim().

This should make a significant improvement with plug-n-play detection
for the majority of users.

Then if you want to introduce runtime detection for snmalloc to do the
same, this would make everything much easier.

Thanks,
Willy

Reply via email to