That's an excellent read. It would be great to read more about the technologies in DragonFly BSD. Something like http://www.openbsd.org/papers ? ;-)
On Fri, Oct 15, 2010 at 3:44 AM, Venkatesh Srinivas <m...@endeavour.zapto.org> wrote:
>
>>> What's a "magazine"?
>>>
>>> Pierre
>
> In libc, nmalloc (lib/libc/stdlib/nmalloc.c) provides malloc() (from malus
> locus, 'bad place') and free() for single-threaded and multithreaded
> applications. In the DragonFly 2.4 release cycle, the original allocator
> (phkmalloc, inherited from FreeBSD) was replaced with a port of the kernel
> slab allocator; in the 2.8 release cycle I committed some work to change the
> multithreaded strategy.
>
> The DragonFly 2.4 and 2.6 libc allocator had two strategies, one for small
> requests, one for large requests; large requests are served directly via
> mmap. Small requests are served from 64k, 64k-aligned regions of memory
> called 'slabs'. Each slab only services requests of a given size, minimizing
> fragmentation. (The DragonFly libc slab allocator was fairly different from
> the original Sun design -- the Sun design has variable-size slabs and a hash
> table for block->slab mappings). For multithreaded applications, the
> allocator kept track of four sets of slab structures; threads would attempt
> to use the set they'd used most recently. If they failed to lock that set,
> they'd move on to the next set.
>
> The 2.7/2.8 allocator has a new structure -- a magazine. A magazine is a
> fixed-size array of blocks of the same size. Each thread carries a pair of
> magazines; when a thread tries to allocate something, it first checks its
> magazines for a buffer _without any locking_. If the magazines are not able
> to support an allocation, a central collection of magazines, called the
> 'depot', is locked and a magazine is retrieved. If the depot is empty, we
> fall back to the slab allocator. The design is also from Sun -- see a paper
> called 'Magazines and Vmem: Extending the Slab Allocator to Many CPUs and
> Multiple Resources'.
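To make the lookup order above concrete, here is a rough sketch in C of the allocation path as described: per-thread magazines first (no lock), then the locked depot, then the slab layer. This is not the code from nmalloc.c. Every name in it (struct magazine, MAG_ROUNDS, BLOCK_SIZE, mag_alloc, depot_put, slab_alloc) is made up for illustration, it handles a single block size only, it uses a pthread mutex where the real allocator uses spinlocks, and the slab layer is faked with malloc().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define MAG_ROUNDS 8    /* hypothetical magazine capacity */
#define BLOCK_SIZE 64   /* one size class only, to keep the sketch short */

struct magazine {
    struct magazine *next;      /* link used while parked in the depot */
    int rounds;                 /* number of blocks currently cached */
    void *blocks[MAG_ROUNDS];   /* fixed-size array of same-size blocks */
};

/* Depot: shared lists of full and empty magazines, guarded by one lock. */
static pthread_mutex_t depot_lock = PTHREAD_MUTEX_INITIALIZER;
static struct magazine *depot_full;
static struct magazine *depot_empty;

/* Each thread carries a pair of magazines and touches them without locking. */
static _Thread_local struct magazine *loaded;
static _Thread_local struct magazine *previous;

/* Stand-in for the slab layer (assumption: plain malloc for the sketch). */
static void *slab_alloc(void)
{
    return malloc(BLOCK_SIZE);
}

/* Park a full (or partly full) magazine in the depot. */
void depot_put(struct magazine *m)
{
    assert(m->rounds > 0 && m->rounds <= MAG_ROUNDS);
    pthread_mutex_lock(&depot_lock);
    m->next = depot_full;
    depot_full = m;
    pthread_mutex_unlock(&depot_lock);
}

void *mag_alloc(void)
{
    struct magazine *m;

    /* 1. Fast path: pop from the loaded magazine, no lock taken. */
    if (loaded != NULL && loaded->rounds > 0)
        return loaded->blocks[--loaded->rounds];

    /* 2. Still lock-free: swap in the other magazine if it has blocks. */
    if (previous != NULL && previous->rounds > 0) {
        m = loaded;
        loaded = previous;
        previous = m;
        return loaded->blocks[--loaded->rounds];
    }

    /* 3. Both magazines are empty: lock the depot and fetch a full one. */
    pthread_mutex_lock(&depot_lock);
    m = depot_full;
    if (m != NULL) {
        depot_full = m->next;
        if (previous != NULL) {     /* recycle the displaced empty magazine */
            previous->next = depot_empty;
            depot_empty = previous;
        }
    }
    pthread_mutex_unlock(&depot_lock);

    if (m != NULL) {
        previous = loaded;
        loaded = m;
        return loaded->blocks[--loaded->rounds];
    }

    /* 4. Depot is empty too: fall back to the slab allocator. */
    return slab_alloc();
}
```

Only step 3 takes a lock, so as long as a thread's own magazines have blocks in them, allocation never contends with other threads. The free path (pushing blocks back into the loaded magazine and handing full magazines to the depot) is left out here.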
> When I last measured, in the 2.6 release cycle, the
> magazine layer sped up sysbench OLTP / MySQL by approximately 20%.
>
> The 2.7/2.8 allocator also has work to reduce the number of mmap/munmap
> system calls relative to the earlier version of the allocator; rather than
> immediately unmapping a slab when it has no outstanding allocations, we keep
> around up to 64 old slabs and we attempt to allocate slabs in bursts from
> the system. When I last measured, the reduction in mmaps/munmaps was fairly
> dramatic.
>
> The latest bug was a fairly unfortunate one -- most of the locks in nmalloc
> use libc's spinlocks. The depot locks, however, used pthread_spinlocks; when
> nmalloc was linked against libc, it used the stub pthread_spinlocks in libc,
> rather than the versions in libthread. This meant that accesses to the depot
> magazine lists were not synchronized at all and the magazine lists were
> getting corrupted. Oops...
>
> -- vs
>

-- 
“If you’re good at something, never do it for free.” - The Joker