Re: std.allocator needs your help

Andrei Alexandrescu Mon, 23 Sep 2013 11:14:54 -0700

On 9/23/13 8:32 AM, Manu wrote:

On 23 September 2013 23:58, Andrei Alexandrescu
<[email protected] <mailto:[email protected]>>
wrote:
This is a good first step though, I'm happy to discuss this, but I think
discussion about the practical application may also reveal design
details at this level.

Absolutely. This is refreshing since I've gone through the cycle of"types must be integrated into allocators and with the global GC" ... ->... "untyped allocators can actually be extricated" once. It is nowgreat to revisit the assumptions again.

One matter I'm rather surprised didn't come up (I assumed everyone wouldbe quite curious about it) is that the allocator does not store theallocated size, or at least does not allow it to be discovered easily.This is a definitely appropriate topic for the design at hand.


The malloc-style APIs go like:

void* malloc(size_t size);
void free(void*);

Note that the user doesn't pass the size to free(), which means theallocator is burdened with inferring the size of the block from thepointer efficiently. Given that most allocators make crucial strategicchoices depending on the size requested, this API shortcoming is a baneof allocator writers everywhere, and a source of inefficiency invirtually all contemporary malloc-style allocators.

This is most ironic since there is next to nothing that user code can dowith a pointer without knowing the extent of memory addressed by it. (Anotable exception is zero-terminated strings, another questionabledesign decision.) It all harkens back to Walter's claim of "C's biggestmistake" (with which I agree) of not defining a type for a boundedmemory region.

Upon studying a few extant allocators and D's lore, I decided that in Dthings have evolved sufficiently to have the user pass the size upondeallocation as well:


void[] allocate(size_t size);
void deallocate(void[] buffer);

This is because the size of D objects is naturally known: classes haveit in the classinfo, slices store it, and the few cases of using baldpointers for allocation are irrelevant and unrecommended.

This all makes memory allocation in D a problem fundamentally simplerthan in C, which is quite an interesting turn :o).

No, I certainly understand you mean 'them', but you lead to what I'm
asking, how do these things get carried/passed around. Are they
discreet, or will they invade argument lists everywhere?

Since there's a notion of "the default" allocator, there will be ways topush/pop user-defined allocators that temporarily (or permanently)replace the default allocator. This stage of the design is concernedwith allowing users to define such user-defined allocators withouthaving to implement them from scratch.

Are they free
to flow in/out of libraries in a natural way?
These patterns are what will define the system as I see it.
Perhaps more importantly, where do these allocators get their memory
themselves (if they're not a bottom-level allocator)? Global override
perhaps, or should a memory source always be explicitly provided to a
non-bottom-level allocator?

There are a few obvious bottom allocators, such as malloc, new, sbrk, orWindows heap. All other allocators will compose by parameterizing ontheir parent allocator.

Sure, but the common case is that the author will almost certainly use
keyword 'new'. How can I affect that as a 3rd party?
This would require me overriding the global allocator somehow... which
you touched on earlier.

The way I'm thinking of this is to install/uninstall user-definedallocators that will satisfy calls to new. Since not all allocatorssupport tracing, some would require the user to deallocate stuff manually.

    The library wouldn't need to worry as there would be the notion of a
    default allocator (probably backed by the existing GC).

Right. So it's looking like like the ability to override the global
allocator is a critical requirement.

Again, this is difficult to handle in the same conversation as "shouldwe pass size upon deallocation". Yes, we need road laws, but it's hardto talk about those and engine lubrication in the same breath. For me,"critical" right now is to assess whether the untyped API misses animportant category of allocators, what safety level it has (that's why"expand" is so different from "realloc"!) etc.

        And does that mean that applications+libraries are required to
        ALWAYS
        allocate through given allocator objects?


    Yes, they should.


Then we make keyword 'new' redundant?

Probably not. Typed allocators will need to integrate with (andoccasionally replace) the global shared GC.

    The problem is that installing an allocator does not get to define
    what a pointer is and what a reference is.

Why not?

Because that requires a language change. I'm not sure you realize butyou move the goalposts all the time. We were talking within the contextof libraries and installing allocators dynamically and all of a suddenyou get to change what the compiler thinks a pointer is.

A pointer has a type, like anything else. An ARC pointer can
theoretically have the compiler insert ARC magic.
That does imply though that the allocator affects the type, which I
don't like... I'll think on it.

I talked Walter's ear off a while ago at an ACCU conference about thenotion that reference counting could be a switch passed to the compiler.Recently he's authored a DIP about the compiler inserting refcountingprimitives in the generated code. Unfortunately I think the DIP isawfully incomplete and superficial (to the extent it would bring moredamage to the language if any implementation effort would be invested init at this point).


Can it be done? Probably. But it would be an absolutely major effort.

    These are notions hardwired into the language, so the notion of
    turning a switch and replacing the global GC with a reference
    counting scheme is impossible at the level of a library API.


Indeed it is. So is this API being built upon an incomplete foundation?


The API is quite good at what it puports to do.

    What I do hope to get to is to have allocators define their own
    pointers and reference types. User code that uses those will be
    guaranteed certain allocation behaviors.


Interesting, will this mangle the pointer type, or the object type being
pointed to? The latter is obviously not desirable. Does the former
actually work in theory?

I don't think I understand what you mean. Honest, it seems to me you'reconfused about what you want, how it can be done, and what moving piecesare involved.

One example is Rust, which defines several pointer types to implementits scoping and borrowing semantics. I think it's not easy to achieveits purported semantics with fewer types, and time will tell whetherprogrammers will put up with the complexity for the sake of thecorresponding benefits.

    At this point collect() is only implemented by the global GC. It is
    possible I'll drop it from the final design. However, it's also
    possible that collect() will be properly defined as "collect all
    objects allocated within this particular allocator that are not
    referred from any objects also allocated within this allocator". I
    think that's a useful definition.

Perhaps. I'm not sure how this situation arises though. Unless you've
managed to implement your own GC inside an allocator.


That should be doable within D today.

I think a critical detail to keep in mind, is that (I suspect) people
simply won't use it if it doesn't interface with keyword 'new'.

I think this is debatable. For one, languages such as Java and C++ stillhave built-in "new" but quite ubiquitously unrecommend their usage inuser code. Far as I can tell that's been a successful campaign.

For two, there are allocations that don't even entail calls to new, suchas array concatenation.

It's extremely common that I want to enforce that a library exist
entirely within a designated heap. It can't fall back to the global GC.

I work on platforms where memory is not unified. Different resources
need to go into different heaps.

The proposed API would make that entirely possible. You seem to careonly with the appendix "and mind you I only want to use new for all ofthose!" which is a bit out there. But nevertheless good feedback.



Andrei

Re: std.allocator needs your help

Reply via email to