On Thu, 25 Nov 2021 at 11:38, Илья Шипицин <chipits...@gmail.com> wrote: > > > > чт, 25 нояб. 2021 г. в 14:54, Willy Tarreau <w...@1wt.eu>: >> >> Hi David, >> >> On Wed, Nov 24, 2021 at 08:08:39PM +0000, David CARLIER wrote: >> > Hi >> > >> > here a little patch for FreeBSD to support memory arenas trimming. >> (...) >> > FreeBSD uses a slighty simplified version of jemalloc as libc allocator >> > since many years (there is thoughts to eventually switch to snmalloc >> > but not before a long time). >> > We detect the libc in the least hacky way in this case aiming as jemalloc >> > specific API then we try to purge arenas as much as we can. >> >> This one is interesting because according to the jemalloc doc on >> my machine it could also work on linux with jemalloc, provided that >> jemalloc is linked at build time (otherwise mallctl remains NULL). >> However it remains available uding dlopen(). >> >> Thus I think that instead of focusing on the OS we ought to continue >> to focus on the allocator and improve runtime detection: >> >> - glibc (currently detected using detect_allocator) >> => use malloc_trim() >> - jemalloc at build time (mallctl != NULL) >> => use mallctl() as you did >> - jemalloc at runtime (mallctl == NULL but dlsym("mallctl") != NULL) >> => use mallctl() as you did >> - others >> => no trimming >
Fair enough I thought it was solely about system libc, but indeed the Linux/jemalloc combination is very common, interesting to look at indeed. Ok will provide the patchsets later on :-) > > I never imagined earlier that high level applications (such as reverse > https/tcp proxy) cares about such low level things as allocator behaviour. > no jokes, really. > <joke>This is how you stand out ;-)</joke> > >> >> >> That would cover the vast majority of use cases where trimming is relevant. >> I'm quite interested in jemalloc on linux, including when used at runtime >> using LD_PRELOAD, because I see it quite frequently in high-performance >> environments. When site operators see their memory usage fall two-fold and >> the CPU as well by just prepending an LD_PRELOAD in their startup script, >> they don't need to think twice and it's quickly done. >> >> Thus I think we could proceed like this in the pools init code: >> >> void detect_allocator() >> { >> extern int mallctl(const char *, void *, size_t *, void *, size_t) >> __attribute__((weak)); >> >> my_mallctl = mallctl; >> if (!my_mallctl) >> my_mallctl = get_sym_curr_addr("mallctl"); >> >> if (!my_mallctl) { >> /* existing tests based on mallinfo/mallinfo2 */ >> } >> } >> >> At this point we'd have: >> - my_mallctl not null when using jemalloc, hence trim() would always >> use your code that purges all arenas >> >> - otherwise if using_libc_allocator is set, we can call malloc_trim() >> >> - otherwise it's an unknown allocator and we don't do anything. >> >> Thus could you please turn your patch into a small series that does >> this ? The steps I'm seeing are: >> 1. changing the detection logic so that it's always performed, and >> not just on HA_HAVE_MALLOC_TRIM. This means the variants of >> functions is_trim_enabled, trim_all_pools and detect_allocator() >> are all remerged into a single variant. trim_all_pools() will be >> the only one to rely on HA_HAVE_MALLOC_TRIM. >> >> 2. introduce mallctl availability detection and put its pointer into >> a separate one that we can get unsing dlfcn when linked in. >> >> 3. modify trim_all_pools() to use mallctl() when available, otherwise >> rely on current code with malloc_trim(). >> >> This should make a significant improvement with plug-n-play detection >> for the majority of users. >> >> Then if you want to introduce runtime detection for snmalloc to do the >> same, this would make everything much easier. >> >> Thanks, >> Willy >>