Hi,

On 2016-12-13 01:45:13 +0100, Tomas Vondra wrote:
>  src/backend/utils/mmgr/Makefile   |   2 +-
>  src/backend/utils/mmgr/aset.c     | 115 +--------------------------------
>  src/backend/utils/mmgr/memdebug.c | 131 ++++++++++++++++++++++++++++++++++++++
>  src/include/utils/memdebug.h      |  22 +++++++
>  4 files changed, 156 insertions(+), 114 deletions(-)
>  create mode 100644 src/backend/utils/mmgr/memdebug.c
I'm a bit loath to move these to a .c file - won't this likely make these
debugging tools even slower? Seems better to put some of them in a header
as static inlines (not randomize, but the rest).

> From 43aaabf70b979b172fd659ef4d0ef129fd78d72d Mon Sep 17 00:00:00 2001
> From: Tomas Vondra <to...@2ndquadrant.com>
> Date: Wed, 30 Nov 2016 15:36:23 +0100
> Subject: [PATCH 2/3] slab allocator
>
> ---
>  src/backend/replication/logical/reorderbuffer.c |  82 +--
>  src/backend/utils/mmgr/Makefile                 |   2 +-
>  src/backend/utils/mmgr/slab.c                   | 803 ++++++++++++++++++++++++
>  src/include/nodes/memnodes.h                    |   2 +-
>  src/include/nodes/nodes.h                       |   1 +
>  src/include/replication/reorderbuffer.h         |  15 +-
>  src/include/utils/memutils.h                    |   9 +

I'd like to see the reorderbuffer changes split into a separate commit from
the slab allocator introduction.

> +/*-------------------------------------------------------------------------
> + *
> + * slab.c
> + *	  SLAB allocator definitions.
> + *
> + * SLAB is a custom MemoryContext implementation designed for cases of
> + * equally-sized objects.
> + *
> + *
> + * Portions Copyright (c) 2016, PostgreSQL Global Development Group

Bump, before a committer forgets it.

> + * IDENTIFICATION
> + *	  src/backend/utils/mmgr/slab.c
> + *
> + *
> + * The constant allocation size allows significant simplification and various
> + * optimizations. Firstly, we can get rid of the doubling and carve the blocks
> + * into chunks of exactly the right size (plus alignment), not wasting memory.

Getting rid of it relative to what? I'd try to phrase it so these comments
stand on their own.

> + * Each block includes a simple bitmap tracking which chunks are used/free.
> + * This makes it trivial to check if all chunks on the block are free, and
> + * eventually free the whole block (which is almost impossible with a global
> + * freelist of chunks, storing chunks from all blocks).
Why is checking a potentially somewhat long-ish bitmap better than a simple
counter, or a "linked list" of "next free chunk-number" or such (where free
chunks simply contain the id of the subsequent chunk)? Using a list instead
of a bitmap would also make it possible to get 'lifo' behaviour, which is
good for cache efficiency. A simple chunk-number based singly linked list
would only imply a minimum allocation size of 4 - that seems perfectly
reasonable?

> + * At the context level, we use 'freelist' to track blocks ordered by number
> + * of free chunks, starting with blocks having a single allocated chunk, and
> + * with completely full blocks on the tail.

Why that way round? Filling chunks up as much as possible is good for cache
and TLB efficiency, and allows for earlier de-allocation of partially used
blocks? Oh, I see you do that in the next comment, but it still leaves me
wondering.

Also, is this actually a list? It's more an array of lists, right? I.e. it
should be named freelists?

Thirdly, isn't that approach going to result in a quite long freelists
array, when you have small items and a decent blocksize? That seems like a
fairly reasonable thing to do?

> + * This also allows various optimizations - for example when searching for
> + * free chunk, we the allocator reuses space from the most full blocks first,
> + * in the hope that some of the less full blocks will get completely empty
> + * (and returned back to the OS).

Might be worth mentioning tlb/cache efficiency too.

> + * For each block, we maintain pointer to the first free chunk - this is quite
> + * cheap and allows us to skip all the preceding used chunks, eliminating
> + * a significant number of lookups in many common usage patters. In the worst
> + * case this performs as if the pointer was not maintained.

Hm, so that'd be eliminated if we maintained a linked list of chunks (by
chunk number) and a free_chunk_cnt or such.
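To make the suggestion concrete, the chunk-number free list could look
roughly like this (a standalone sketch, all names hypothetical, not from
the patch): each free chunk's first 4 bytes hold the index of the next
free chunk, so allocation pops the head and freeing pushes onto it, both
O(1), with the LIFO reuse mentioned above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNKS_PER_BLOCK 64
#define CHUNK_SIZE 32			/* must be >= sizeof(int32_t) */
#define NO_FREE_CHUNK (-1)

typedef struct DemoBlock
{
	int32_t		first_free;		/* head of the embedded free list */
	int			nfree;			/* number of free chunks */
	char		chunks[CHUNKS_PER_BLOCK][CHUNK_SIZE];
} DemoBlock;

static void
block_init(DemoBlock *block)
{
	/* thread all chunks onto the free list: chunk i points to i+1 */
	for (int i = 0; i < CHUNKS_PER_BLOCK; i++)
		*(int32_t *) block->chunks[i] =
			(i + 1 < CHUNKS_PER_BLOCK) ? i + 1 : NO_FREE_CHUNK;
	block->first_free = 0;
	block->nfree = CHUNKS_PER_BLOCK;
}

static void *
block_alloc(DemoBlock *block)
{
	void	   *chunk;

	if (block->first_free == NO_FREE_CHUNK)
		return NULL;			/* block full */
	chunk = block->chunks[block->first_free];
	block->first_free = *(int32_t *) chunk;	/* pop the head */
	block->nfree--;
	return chunk;
}

static void
block_free(DemoBlock *block, void *chunk)
{
	int32_t		idx = (int32_t)
		(((char *) chunk - &block->chunks[0][0]) / CHUNK_SIZE);

	*(int32_t *) chunk = block->first_free;	/* push onto the head */
	block->first_free = idx;
	block->nfree++;
}
```

Note there's no bitmap scan anywhere: "is the block completely free" is
just nfree == CHUNKS_PER_BLOCK, and the most recently freed chunk is handed
out next.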
> +
> +#include "postgres.h"
> +
> +#include "utils/memdebug.h"
> +#include "utils/memutils.h"
> +#include "lib/ilist.h"

Move ilist up, above memdebug, so the list is alphabetically ordered.

> +/*
> + * SlabPointer
> + *		Aligned pointer which may be a member of an allocation set.
> + */
> +typedef void *SlabPointer;
> +typedef SlabContext *Slab;

I personally won't commit this with pointer-hiding typedefs. If somebody
else does, I can live with it, but for me it's bad enough taste that I
won't.

> +/*
> + * SlabContext is a specialized implementation of MemoryContext.
> + */
> +typedef struct SlabContext
> +{
> +	MemoryContextData header;	/* Standard memory-context fields */
> +
> +	/* Allocation parameters for this context: */
> +	Size		chunkSize;		/* chunk size */
> +	Size		fullChunkSize;	/* chunk size including header and alignment */
> +	Size		blockSize;		/* block size */
> +	int			chunksPerBlock;	/* number of chunks per block */
> +	int			minFreeChunks;	/* min number of free chunks in any block */
> +	int			nblocks;		/* number of blocks allocated */
> +
> +	/* blocks with free space, grouped by number of free chunks: */
> +	dlist_head	freelist[FLEXIBLE_ARRAY_MEMBER];
> +} SlabContext;
> +

Why aren't these ints something unsigned?

> +/*
> + * SlabIsValid
> + *		True iff set is valid allocation set.
> + */
> +#define SlabIsValid(set) PointerIsValid(set)

It's not your fault, but this "iff" is obviously a lot stronger than the
actual test ;). I seriously doubt this macro is worth anything...

> +/*
> + * SlabReset
> + *		Frees all memory which is allocated in the given set.
> + *
> + * The code simply frees all the blocks in the context - we don't keep any
> + * keeper blocks or anything like that.
> + */

Why don't we? Seems quite worthwhile? Thinking about this, won't this
result in a drastic increase of system malloc/mmap/brk traffic when there
are lots of short transactions in reorderbuffer?
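The keeper-block idea amounts to something like the following (a toy sketch
under assumed names, not the patch's code): on reset, release every block
except one retained "keeper", so a context that is reset and refilled over
and over doesn't round-trip through malloc each cycle:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* a toy context: singly linked list of blocks plus one retained keeper */
typedef struct DemoBlock
{
	struct DemoBlock *next;
} DemoBlock;

typedef struct DemoContext
{
	DemoBlock  *blocks;			/* all blocks currently allocated */
	DemoBlock  *keeper;			/* the one block retained across resets */
} DemoContext;

static void
demo_reset(DemoContext *cxt)
{
	DemoBlock  *block = cxt->blocks;

	while (block)
	{
		DemoBlock  *next = block->next;

		if (block == cxt->keeper)
			block->next = NULL;	/* keep it, just unlink from the rest */
		else
			free(block);		/* everything else goes back to the OS */
		block = next;
	}
	/* the context starts over with just the keeper block */
	cxt->blocks = cxt->keeper;
}
```

aset.c's keeper-block handling does essentially this for allocation sets;
the question above is why slab.c shouldn't do the same.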
> +static void
> +SlabReset(MemoryContext context)
> +{
> +	/* walk over freelists and free the blocks */
> +	for (i = 0; i <= set->chunksPerBlock; i++)
> +	{
> +		dlist_mutable_iter miter;
> +
> +		dlist_foreach_modify(miter, &set->freelist[i])
> +		{
> +			SlabBlock	block = dlist_container(SlabBlockData, node, miter.cur);
> +
> +			dlist_delete(miter.cur);
> +
> +			/* Normal case, release the block */

What does "normal case" refer to here? Given that there's no alternative
case...

> +	/*
> +	 * We need to update index of the next free chunk on the block. If we used
> +	 * the last free chunk on this block, set it to chunksPerBlock (which is
> +	 * not a valid chunk index). Otherwise look for the next chunk - we know
> +	 * that it has to be above the current firstFreeChunk value, thanks to how
> +	 * we maintain firstFreeChunk here and in SlabFree().
> +	 */
> +	if (block->nfree == 0)
> +		block->firstFreeChunk = set->chunksPerBlock;
> +	else
> +	{
> +		/* look for the next free chunk in the block, after the first one */
> +		while ((++block->firstFreeChunk) < set->chunksPerBlock)
> +		{
> +			int			byte = block->firstFreeChunk / 8;
> +			int			bit = block->firstFreeChunk % 8;
> +
> +			/* stop when you find 0 (unused chunk) */
> +			if (!(block->bitmapptr[byte] & (0x01 << bit)))
> +				break;
> +		}
> +
> +		/* must have found the free chunk */
> +		Assert(block->firstFreeChunk != set->chunksPerBlock);
> +	}

This and the previous code just re-affirm my opinion that a bitmap is not
the best structure here.

> +	/* move the whole block to the right place in the freelist */
> +	dlist_delete(&block->node);
> +	dlist_push_head(&set->freelist[block->nfree], &block->node);

Hm. What if we, instead of the array of doubly linked lists, just kept a
single linked list of blocks, and kept that list sorted by number of free
chunks? Given that freeing / allocation never changes the number of
allocated chunks by more than 1, we'll never have to move an entry far in
that list to keep it sorted.
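Roughly what that maintenance looks like (a sketch with hypothetical names;
an array stands in for the suggested linked list to keep the example short):
after a block's nfree changes by exactly ±1, the order is restored by
sliding the block past neighbours, which only ever crosses blocks whose
counts equal the old value:

```c
#include <assert.h>

typedef struct DemoBlock
{
	int			nfree;			/* free chunks on this block */
} DemoBlock;

/*
 * blocks[0..nblocks-1] is kept sorted ascending by nfree; call this after
 * blocks[i]->nfree changed by exactly +1 or -1 to restore the order.
 */
static void
resort_after_change(DemoBlock *blocks[], int nblocks, int i)
{
	/* nfree grew: slide the block toward the tail */
	while (i + 1 < nblocks && blocks[i]->nfree > blocks[i + 1]->nfree)
	{
		DemoBlock  *tmp = blocks[i];

		blocks[i] = blocks[i + 1];
		blocks[i + 1] = tmp;
		i++;
	}
	/* nfree shrank: slide the block toward the head */
	while (i > 0 && blocks[i]->nfree < blocks[i - 1]->nfree)
	{
		DemoBlock  *tmp = blocks[i];

		blocks[i] = blocks[i - 1];
		blocks[i - 1] = tmp;
		i--;
	}
}
```

With many blocks sharing the same nfree value a single change can still
cross the whole run of equal entries, but each step is one pointer swap, so
the common case stays cheap compared with recomputing a freelist bucket.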
> +/*
> + * SlabRealloc
> + *		As Slab is designed for allocating equally-sized chunks of memory, it
> + *		can't really do an actual realloc.
> + *
> + * We try to be gentle and allow calls with exactly the same size as in that
> + * case we can simply return the same chunk. When the size differs, we fail
> + * with assert failure or return NULL.
> + *
> + * We might be even support cases with (size < chunkSize). That however seems
> + * rather pointless - Slab is meant for chunks of constant size, and moreover
> + * realloc is usually used to enlarge the chunk.
> + *
> + * XXX Perhaps we should not be gentle at all and simply fails in all cases,
> + * to eliminate the (mostly pointless) uncertainty.

I think I'm in favor of that. This seems more likely to hide a bug than to
actually be helpful.

Regards,

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers