On 05/06/2014 04:04 PM, Robert Haas wrote:
Over the last several months, I've been working on a new memory
allocator for PostgreSQL.  While it's not done, and there are
problems, I think I've reached the point where it makes sense to get
this out in front of a wider audience and get some feedback,
preferably of the sort that doesn't do any permanent damage.  I
started out down this path because parallel sort will really be much
happier if it can allocate from either dynamic shared memory or
backend-local memory using the same interface.  Arguably, we could
implement parallel internal sort using some kind of bespoke memory
management strategy, like packing all of the tuples into the memory
segment tightly starting from the end, and growing the tuple array
from the beginning, but then what happens if we decide to give up on
parallelism and do an external sort after all?  Even if we could
engineer our way around that problem, I think there are bound to be
other things that people want to do with dynamic shared memory
segments where being able to easily allocate and free memory is
desirable.

As a generic remark, I wish that whatever parallel algorithms we will use won't need a lot of ad hoc memory allocations from shared memory. Even though we have dynamic shared memory now, complex data structures with a lot of pointers and different allocations are more painful to debug, tune, and make concurrency-safe. But I have no idea what exactly you have in mind, so I'll just have to take your word for it that this is sensible.

As I got into the problem a little bit further, I realized that
AllocSetAlloc is actually a pretty poor allocator for sorting.  As far
as I can see, the basic goal of aset.c was to make context resets
cheap - which makes sense, because we have a lot of very short-lived
memory contexts.  However, when you're sorting a lot of data, being
able to reset the context quickly is not as important as using memory
efficiently, and AllocSetAlloc is pretty terrible at that, especially
for small allocations.  Each allocation is rounded up to a power of
two, and then we add 16 bytes of overhead on top of that (on 64-bit
systems), which means that palloc has 100% overhead for a large number
of 16-byte allocations, and 200% overhead for a large number of 8-byte
allocations.  The 16-byte overhead doesn't hurt quite as much on big
allocations, but the rounding to a power of two can sometimes be very
bad.  For example, for repeated 96-byte allocations, palloc has 50%
overhead.  I decided I wanted to come up with something better.
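
The overhead figures above can be checked with a few lines of arithmetic. This is only a sketch of the calculation, not code from aset.c: the 16-byte header and the power-of-two rounding come from the description above, while the 8-byte minimum chunk size and the function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Per-chunk header size on 64-bit systems, as stated in the text above. */
#define CHUNK_HEADER_BYTES 16

/* Round a request up to the next power of two (8-byte minimum is assumed). */
static size_t
round_up_pow2(size_t request)
{
    size_t size = 8;

    assert(request > 0);
    while (size < request)
        size <<= 1;
    return size;
}

/* Wasted space (rounding plus header) as a percentage of the request. */
static size_t
overhead_percent(size_t request)
{
    size_t total = round_up_pow2(request) + CHUNK_HEADER_BYTES;

    return (total - request) * 100 / request;
}
```

For a 96-byte request the chunk is rounded to 128 bytes and the header brings it to 144 bytes, so 48 of the 144 bytes are overhead, which is 50% of the 96 bytes requested.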

Yeah, I saw in some tests that about 50% of the memory used for catalog caches was waste caused by rounding up all the allocations to a power of two.

I read through the literature and found that most modern allocators
seemed to use superblocks.  A superblock is a chunk of memory, perhaps
64kB, which is carved up into a bunch of chunks of equal size.  Those
chunks don't have individual headers; instead, the pointer address is
used to locate the metadata for the chunk.  Requests are binned into
size classes, which are generally much more fine-grained than the
powers-of-two strategy used by AllocSetAlloc.  Of course, too many
size classes can waste memory if you end up with many partially-full
superblocks, each for a different size class, but the consensus seems
to be that you make it up and more by wasting less memory within each
allocation (e.g. a 96 byte allocation can be exactly 96 bytes, rather
than 128 bytes).
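
To make the superblock idea concrete, here is a minimal sketch of one way the pointer address can locate the metadata: each superblock is aligned to its own size, so masking off the low bits of any chunk pointer yields the superblock header. This is an assumption for illustration, not necessarily how the proposed allocator does the lookup, and all names (SuperblockHeader, superblock_alloc, and so on) are hypothetical. Freeing individual chunks and choosing size classes are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SUPERBLOCK_SIZE 65536   /* 64kB, as in the text above */

/* One header per superblock; individual chunks carry no header. */
typedef struct SuperblockHeader
{
    size_t chunk_size;   /* every chunk in this superblock has this size */
    size_t nallocated;   /* number of chunks handed out so far */
} SuperblockHeader;

/* Allocate a superblock aligned to its own size (C11 aligned_alloc). */
static SuperblockHeader *
superblock_create(size_t chunk_size)
{
    SuperblockHeader *sb = aligned_alloc(SUPERBLOCK_SIZE, SUPERBLOCK_SIZE);

    if (sb == NULL)
        return NULL;
    sb->chunk_size = chunk_size;
    sb->nallocated = 0;
    return sb;
}

/* Hand out the next unused chunk, or NULL if the superblock is full. */
static void *
superblock_alloc(SuperblockHeader *sb)
{
    size_t offset = sizeof(SuperblockHeader) + sb->nallocated * sb->chunk_size;

    if (offset + sb->chunk_size > SUPERBLOCK_SIZE)
        return NULL;
    sb->nallocated++;
    return (char *) sb + offset;
}

/* Recover the metadata from the chunk address alone by masking low bits. */
static SuperblockHeader *
chunk_get_superblock(void *chunk)
{
    uintptr_t mask = ~((uintptr_t) SUPERBLOCK_SIZE - 1);

    return (SuperblockHeader *) ((uintptr_t) chunk & mask);
}
```

With a 96-byte size class, consecutive chunks sit exactly 96 bytes apart, and the only fixed cost is the single header at the start of the 64kB block.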

Interesting work.

Another kind of memory allocator that I've played with in the past is a simple stack allocator that only supports wholesale release of the whole context, and where pfree() is a no-op. That's dead simple and very efficient when you don't need retail pfree(), but obviously it cannot completely replace the current allocator.
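
A stack allocator of that kind can be sketched in a few lines. The names below (StackContext, stack_alloc, and so on) are made up for illustration and are not an existing PostgreSQL API. The sketch uses a single fixed-size block and 8-byte alignment, both of which are simplifying assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A context that hands out memory by advancing an offset into one block. */
typedef struct StackContext
{
    char   *base;       /* start of the backing block */
    size_t  capacity;   /* size of the backing block in bytes */
    size_t  used;       /* bytes handed out so far */
} StackContext;

/* Returns 1 on success, 0 if the backing block could not be allocated. */
static int
stack_context_init(StackContext *cxt, size_t capacity)
{
    cxt->base = malloc(capacity);
    cxt->capacity = capacity;
    cxt->used = 0;
    return cxt->base != NULL;
}

/* Allocate by bumping the offset; sizes are rounded up to 8 bytes. */
static void *
stack_alloc(StackContext *cxt, size_t size)
{
    size_t  aligned = (size + 7) & ~(size_t) 7;
    void   *ptr;

    if (aligned < size || aligned > cxt->capacity - cxt->used)
        return NULL;
    ptr = cxt->base + cxt->used;
    cxt->used += aligned;
    return ptr;
}

/* Retail free is a no-op; memory comes back only on reset. */
static void
stack_free(StackContext *cxt, void *ptr)
{
    (void) cxt;
    (void) ptr;
}

/* Wholesale release: everything allocated so far becomes reusable. */
static void
stack_reset(StackContext *cxt)
{
    cxt->used = 0;
}
```

There are no per-chunk headers and no rounding beyond alignment, which is where the efficiency comes from. The cost is that nothing is reclaimed until the whole context is reset.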

I wouldn't conflate shared memory with this. If a piece of code needs to work with either one, I think the way to go is to have some sort of wrapper functions that route the calls to either the shared or private memory allocator, similar to how the same interface is used to deal with local, temporary buffers and shared buffers.
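
One possible shape for such wrapper functions is sketched below, with the caller passing a small table of function pointers that selects the backing allocator. Everything here is hypothetical: AllocatorOps, generic_alloc and the "shared" stand-in (a bump allocator over a caller-supplied region) are illustration only, and a real shared-memory allocator would also need locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The common interface: callers see only these two operations. */
typedef struct AllocatorOps
{
    void *(*alloc) (void *state, size_t size);
    void  (*release) (void *state, void *ptr);
    void  *state;       /* allocator-specific data, may be NULL */
} AllocatorOps;

/* Backend-private memory: plain malloc/free in this sketch. */
static void *
private_alloc(void *state, size_t size)
{
    (void) state;
    return malloc(size);
}

static void
private_release(void *state, void *ptr)
{
    (void) state;
    free(ptr);
}

/* Stand-in for a shared segment: bump allocation in a fixed region. */
typedef struct SharedRegion
{
    char   *base;
    size_t  size;
    size_t  used;
} SharedRegion;

static void *
shared_alloc(void *state, size_t size)
{
    SharedRegion *region = state;
    size_t        aligned = (size + 7) & ~(size_t) 7;
    void         *ptr;

    if (aligned < size || aligned > region->size - region->used)
        return NULL;
    ptr = region->base + region->used;
    region->used += aligned;
    return ptr;
}

static void
shared_release(void *state, void *ptr)
{
    /* No retail free in this stand-in. */
    (void) state;
    (void) ptr;
}

/* The wrappers: route each call to whichever allocator was chosen. */
static void *
generic_alloc(const AllocatorOps *ops, size_t size)
{
    return ops->alloc(ops->state, size);
}

static void
generic_free(const AllocatorOps *ops, void *ptr)
{
    ops->release(ops->state, ptr);
}
```

Code such as a sort would take an AllocatorOps pointer and never need to know which kind of memory it is using, while the two allocators themselves stay separate.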

- Heikki

