Thu, 25 Nov 2021 at 18:22, David CARLIER <devne...@gmail.com>:
> On Thu, 25 Nov 2021 at 12:25, Willy Tarreau <w...@1wt.eu> wrote:
> >
> > On Thu, Nov 25, 2021 at 04:38:27PM +0500, ???? ??????? wrote:
> > > > Thus I think that instead of focusing on the OS we ought to continue
> > > > to focus on the allocator and improve runtime detection:
> > > >
> > > >   - glibc (currently detected using detect_allocator)
> > > >     => use malloc_trim()
> > > >   - jemalloc at build time (mallctl != NULL)
> > > >     => use mallctl() as you did
> > > >   - jemalloc at runtime (mallctl == NULL but dlsym("mallctl") != NULL)
> > > >     => use mallctl() as you did
> > > >   - others
> > > >     => no trimming
> > > >
> > >
> > > I never imagined earlier that high level applications (such as a reverse
> > > https/tcp proxy) care about such low level things as allocator behaviour.
> > > No joke, really.
> >
> > Yes it does count a lot. That's also why we spent a lot of time optimizing
> > the pools, to limit the number of calls to the system's allocator for
> > everything that uses a fixed size. I've seen some performance graphs in
> > our internal ticket tracker showing the memory consumption before and
> > after the switch to jemalloc, and the CPU usage as well, and sometimes
> > it was very important.
> >
> > Glibc improved quite a bit recently (2.28 or 2.33 I don't remember) by
> > implementing a per-thread cache in its ptmalloc. But in our case it's
> > still not as good as jemalloc, and neither performs as well as our
> > thread-local pools for fixed sizes.
> >
> > I'm seeing in a paper about snmalloc that it performs exceptionally well
> > for small allocations. I just don't know how this degrades depending on
> > the access patterns. For example some allocators are fast when you free()
> > in the exact reverse allocation order, but can start to fragment or have
> > more work to do finding holes if you don't free() in the exact same order.
> >
>
> If you're curious there is also mimalloc (with a pretty rich C API)
> from Microsoft too.
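For readers following along, the detection order from the list quoted above could be sketched roughly as below. This is only a sketch, not HAProxy's actual code: the names detect_trim_method(), trim_memory() and je_mallctl are made up for illustration, the glibc branch is a simple compile-time check rather than the runtime detect_allocator mentioned in the thread, and the weak symbol relies on a GCC/Clang extension.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__GLIBC__)
#include <malloc.h>               /* malloc_trim() */
#endif

/* Prototype of jemalloc's mallctl(). Declared weak (GCC/Clang extension) so
 * that its address is NULL when jemalloc was not linked at build time. */
extern int mallctl(const char *name, void *oldp, size_t *oldlenp,
                   void *newp, size_t newlen) __attribute__((weak));

typedef int (*mallctl_fn)(const char *, void *, size_t *, void *, size_t);

enum trim_method { TRIM_NONE, TRIM_GLIBC, TRIM_JEMALLOC };

/* Hypothetical holder for whichever mallctl() was found, if any. */
static mallctl_fn je_mallctl;

enum trim_method detect_trim_method(void)
{
    /* 1. jemalloc linked at build time: the weak symbol is non-NULL */
    if (mallctl) {
        je_mallctl = mallctl;
        return TRIM_JEMALLOC;
    }
    /* 2. jemalloc injected at runtime (e.g. LD_PRELOAD): ask the loader */
    void *self = dlopen(NULL, RTLD_LAZY);
    if (self) {
        je_mallctl = (mallctl_fn)dlsym(self, "mallctl");
        if (je_mallctl)
            return TRIM_JEMALLOC;
    }
#if defined(__GLIBC__)
    /* 3. assumption: built against glibc means glibc's malloc is in use */
    return TRIM_GLIBC;
#else
    /* 4. others: no trimming */
    return TRIM_NONE;
#endif
}

/* Returns nonzero if a jemalloc purge was requested or glibc released memory. */
int trim_memory(enum trim_method m)
{
    if (m == TRIM_JEMALLOC && je_mallctl) {
        unsigned int i, narenas = 0;
        size_t len = sizeof(narenas);

        if (je_mallctl("arenas.narenas", &narenas, &len, NULL, 0) != 0)
            return 0;
        for (i = 0; i < narenas; i++) {
            char name[32];
            snprintf(name, sizeof(name), "arena.%u.purge", i);
            (void)je_mallctl(name, NULL, NULL, NULL, 0); /* errors ignored */
        }
        return 1;
    }
#if defined(__GLIBC__)
    if (m == TRIM_GLIBC)
        return malloc_trim(0);
#endif
    return 0;
}
```

Calling trim_memory(detect_trim_method()) once the process goes idle would be the typical use. On glibc older than 2.34 this needs -ldl at link time.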
>

I had a bad experience with tcmalloc on arm64: it turned out that it was not
properly tested on that platform. Actually, I wonder whether we really need
massive alloc/free at all, instead of using preallocated objects.

> > But that's something to keep an eye on in the future.
> >
> > Willy
>
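To make the "preallocated objects" idea concrete, here is a minimal sketch of a fixed-size object pool. It is an illustration under my own assumptions, not HAProxy's pool code: the struct and the pool_* names are hypothetical, it is single-threaded (no thread-local caches), objects are only aligned to pointer size, and the pool never grows.

```c
#include <stddef.h>
#include <stdlib.h>

struct pool {
    void  *free_list;   /* singly linked list threaded through free objects */
    void  *slab;        /* the single block obtained from malloc() */
    size_t obj_size;    /* rounded-up size of one object */
};

/* Preallocate 'count' objects of 'obj_size' bytes. Returns 0 on success. */
int pool_init(struct pool *p, size_t obj_size, size_t count)
{
    size_t i;

    /* a free object stores the "next" pointer in its first bytes, so round
     * the size up to a multiple of the pointer size */
    if (obj_size < sizeof(void *))
        obj_size = sizeof(void *);
    obj_size = (obj_size + sizeof(void *) - 1) / sizeof(void *) * sizeof(void *);

    p->slab = malloc(obj_size * count);
    if (!p->slab)
        return -1;
    p->obj_size = obj_size;
    p->free_list = NULL;

    /* push every object onto the free list */
    for (i = 0; i < count; i++) {
        void *obj = (char *)p->slab + i * obj_size;
        *(void **)obj = p->free_list;
        p->free_list = obj;
    }
    return 0;
}

/* Pop one object, or return NULL when the pool is exhausted. */
void *pool_alloc(struct pool *p)
{
    void *obj = p->free_list;

    if (obj)
        p->free_list = *(void **)obj;
    return obj;
}

/* Push an object back; it must have come from this pool. */
void pool_free(struct pool *p, void *obj)
{
    *(void **)obj = p->free_list;
    p->free_list = obj;
}

/* Release the whole slab at once; outstanding objects become invalid. */
void pool_destroy(struct pool *p)
{
    free(p->slab);
    p->slab = NULL;
    p->free_list = NULL;
}
```

With this layout the system allocator is called once in pool_init() and once in pool_destroy(); every pool_alloc()/pool_free() in between is a couple of pointer moves, whatever order objects are released in.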