Sean Kelly wrote:
Andrei Alexandrescu wrote:
KennyTM~ wrote:
The best solution I can think of, without compiler modification is a
struct/class that contains a static array member T[1024] and a
dynamic array member T[] initialized to null; and the code chooses
which member to use in the constructor. But this always occupies
1024*T.sizeof bytes and there will always be a conditional (if)
sticked to all access methods.
This entire discussion gets me thinking - could an alternate stack be
a good (or even better) solution? Consider a global per-thread
"superstack" - a growable region that allocates memory in large chunks
and makes sub-chunks accessible in a strictly LIFO manner. The
primitives of the superstack are as follows:
void* SuperStack.allocate(size_t bytes);
void SuperStack.free(size_t bytes);
size_t SuperStack.slack();
The SuperStack's management is a singly-linked list of large blocks.
It's an implementation detail, but when the owner thread dies it should
null the links for all the list nodes, assuming the data contained in
those nodes can be shared across threads.
Damn straight it could free() them assuming it has malloc()ated them!
With the SuperStack in place, code could look like this:
void foo(in size_t s)
{
auto a = SuperStack.array(int)(s, Uninitialized.yeahIKnow);
scope(exit) SuperStack.free(s);
... play with a ...
}
Is there interest for such a thing?
I suppose I'm wondering what the advantage is of this approach over
simply using dynamic memory.
Speed. An allocation/deallocation is most of the time a counter bump and
a check.
Is it the fact that allocations are
non-locking?
No.
Pure stack allocations make sense because all such
allocations are scoped and guaranteed to be extremely fast. So I
suppose this is simply a parallel local "stack" for the same purpose and
the goal is to avoid the assembly magic necessary for an alloca() type
solution? And if this is the case, then I assume that sharing would be
illegal (at least on D2)?
It is better than the stack because it can tap into the entire heap, as
opposed to whatever was reserved for the stack. You can also expand an
array on the superstack, whereas you can't with alloca. (I forgot to
provide that primitive.) Yah, and indeed recall that D2 is all built
around no implicit sharing across threads.
My initial reaction is that I'm skeptical of any attempt at manual
replacement of built-in functionality. It's been shown, for example,
that custom allocators often perform worse than the default allocators.
I seem to recall that. Oh, wait, I gave a talk on that, several times
:o). Emery Berger studied the problem. It turns out user-defined heaps
are usually worse off, with one exception: regions. SuperStack is a
region. That's why I'm counting it could make a difference.
That aside, the general approach does seem reasonable. Though I'd
still prefer we just get real stack-based dynamic allocation in D. It
seems quite natural that "scope" should provide this for arrays as a QOI
feature, as it does for classes.
I agree. There are a few problems. One is that loops can make things
tricky, as you need to call alloca only once. The other has to do with
the way Walter generates code, I forgot the details, but it's fixable if
some time is invested. And the third is, as I said, stack allocation can
actually fail long before the system runs out of memory. It has happened
to me.
Andrei