On Tuesday, 24 September 2013 at 16:06:39 UTC, Andrei Alexandrescu wrote:
On 9/24/13 4:38 AM, Dan Schatzberg wrote:
One thing I'm not sure this design addresses is memory locality. I know of libnuma (http://linux.die.net/man/3/numa), which lets me specify at run time, for each allocation, which NUMA domain the memory should be allocated from.

In the case that I want to allocate memory in a specific NUMA domain (not just local vs non-local), I believe this design is insufficient because the number of domains is only known at run time.

Also, as far as alignment is concerned, I will throw in that x86 is unusual in having a statically known cache-line size. Both ARM and PowerPC cores can differ in their cache-line sizes. I feel this is a significant argument for the ability to express alignment dynamically.

Could you send a few links so I can take a look?

My knee-jerk reaction to this is that NUMA allocators would provide their own additional primitives and not participate naively in compositions with other allocators.


Andrei

I'm not sure what kind of links you're looking for.

The following link is a good discussion of the issue and the current solutions:
http://queue.acm.org/detail.cfm?id=2513149

In particular:

"The application may want fine-grained control of how the operating system handles allocation for each of its memory segments. For that purpose, system calls exist that allow the application to specify which memory region should use which policies for memory allocations.

The main performance issues typically involve large structures that are accessed frequently by the threads of the application from all memory nodes and that often contain information that needs to be shared among all threads. These are best placed using interleaving so that the objects are distributed over all available nodes."

The Linux/libc interfaces are linked in my first comment. Specifically, with the mbind() call one can specify the policy for allocations from a virtual address range (that is, which NUMA node the backing physical pages are allocated from). More generally, you could imagine specifying this per allocation.
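
To make the mbind() point concrete, here is a minimal Linux-only sketch of a per-region request. The helper names numaRegionAlloc and numaRegionFree are hypothetical, and the raw syscall is used only so the sketch does not need to link against libnuma. Treat it as an illustration under those assumptions, not as how any allocator in this design would have to work.

```cpp
#include <linux/mempolicy.h>  // MPOL_BIND
#include <sys/mman.h>         // mmap, munmap
#include <sys/syscall.h>      // SYS_mbind
#include <unistd.h>           // syscall
#include <cstddef>

// Hypothetical helper (not from the thread): map `size` bytes of anonymous
// memory and ask the kernel to back that range with pages from NUMA node
// `node`. Returns nullptr if the mapping itself fails. `bound` reports
// whether the mbind request was accepted; it can be refused on kernels
// without NUMA support, for an unavailable node, or inside a sandbox.
inline void* numaRegionAlloc(std::size_t size, unsigned node, bool& bound) {
    bound = false;
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return nullptr;

    // One-word node mask. Linux has treated maxnode as one more than the
    // number of usable bits, so this sketch conservatively skips the top bit.
    constexpr unsigned bits = 8 * sizeof(unsigned long);
    if (node < bits - 1) {
        unsigned long mask = 1UL << node;
        bound = syscall(SYS_mbind, p, size, static_cast<long>(MPOL_BIND),
                        &mask, static_cast<unsigned long>(bits), 0L) == 0;
    }
    return p;  // Still usable memory even if the policy was not applied.
}

// Release a region obtained from numaRegionAlloc.
inline void numaRegionFree(void* p, std::size_t size) {
    if (p != nullptr) munmap(p, size);
}
```

The node argument is an ordinary run-time value, which is the point: nothing about the target domain has to be known when the program is compiled.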

What is your objective, though? Aren't you trying to define a hierarchy of allocators where more specific allocators can be composed from general ones? If so, what is the concern with including locality at the base level? Locality is one characteristic of memory that programmers might care about, and you can rather trivially compose a non-locality-aware allocator from a locality-aware one.
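
As a rough sketch of that composition (written in C++ rather than D, with all names hypothetical): NodeAwareMalloc stands in for a locality-aware base allocator and only records the requested node instead of really placing memory, and PinnedToNode wraps it to expose a plain allocate(size) interface.

```cpp
#include <cstddef>
#include <cstdlib>

// Stand-in for a locality-aware base allocator. A real one would place the
// memory on `node` (for example through libnuma or mbind); this one only
// records the request so the sketch runs on any machine.
struct NodeAwareMalloc {
    unsigned lastNode = 0;
    void* allocate(std::size_t size, unsigned node) {
        lastNode = node;
        return std::malloc(size);
    }
    void deallocate(void* p) { std::free(p); }
};

// Adapter: hides the node parameter, yielding an ordinary allocate(size)
// interface that can take part in generic compositions. The node is a
// run-time value, since the number of domains is only known at run time.
template <class Base>
struct PinnedToNode {
    Base base;
    unsigned node;

    explicit PinnedToNode(unsigned n, Base b = Base()) : base(b), node(n) {}

    void* allocate(std::size_t size) { return base.allocate(size, node); }
    void deallocate(void* p) { base.deallocate(p); }
};
```

For example, PinnedToNode<NodeAwareMalloc> a(3); a.allocate(64); forwards every request to node 3 without the caller ever mentioning locality. Going the other way, recovering locality from an allocator that never exposed it, is not possible, which is the argument for having it at the base level.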
