On Mon, May 14, 2012 at 1:25 PM, Eike Scholz <[email protected]> wrote:
> The impact of a (guided) garbage collector...
>
I am not clear what you mean by "guided". It has been my experience that
attempts to guide the garbage collector from the application are
*almost* always
a mistake.
In a GC'd system, the state of the heap is the cumulative result of a
complex sequence of dynamic operations over a relatively long sequence of
code execution. Because of this, it is not really possible for an
application to know when GC can safely be "turned off". Basically, you
don't have enough information statically about the dynamic heap
state. Because of this, "guided" GC doesn't work well. It's not something
you can really capture in the type system, either. Type systems mainly
serve to describe static conditions rather than dynamic conditions.
There *are* points in *some* applications (notably event loop applications)
where there is reason to know that the application is in a "least retained
memory" state. This is a good point to explicitly invoke GC.
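To make the idea concrete, here is a toy sketch in C of collecting at such a
quiescent point. All of the names (`loop_alloc`, `collect_at_quiescent_point`)
are invented for illustration; a real collector would trace and free only dead
objects rather than sweeping a simple list:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy per-iteration allocation tracker.  Allocations made during one
 * turn of the event loop are recorded so they can be released in one
 * sweep at a known quiescent point. */
#define MAX_OBJS 256

static void *objs[MAX_OBJS];
static int n_objs = 0;

void *loop_alloc(size_t sz) {
    void *p = malloc(sz);
    if (n_objs < MAX_OBJS)
        objs[n_objs++] = p;
    return p;
}

/* Called at the top of the event loop, when the application knows it
 * holds the least retained memory: everything from the previous
 * iteration is dead, so it is safe (and cheap) to collect now. */
void collect_at_quiescent_point(void) {
    for (int i = 0; i < n_objs; i++)
        free(objs[i]);
    n_objs = 0;
}

int live_count(void) { return n_objs; }
```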
There is also a bunch of static analysis that can be done to automatically
stack-allocate storage and/or identify heap objects that are accessed by a
single thread. Both of these are *really* worthwhile.
What you *can* do is build sub-programs that are known to be non-allocating
in the heap, and therefore will not trigger GC during their execution. It
is helpful if the type system provides some allocation effect analysis to
support writing this type of code.
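For instance, a routine that works entirely on caller-supplied storage
performs no heap allocation and therefore cannot trigger a collection
mid-flight. A small hypothetical example of the style:

```c
#include <stddef.h>

/* A deliberately non-allocating routine: it touches only its
 * arguments and the stack, so it can never trigger GC.  An
 * allocation-effect annotation in the type system would let the
 * compiler verify this property rather than trusting the author. */
int sum_nonalloc(const int *src, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += src[i];
    return acc;
}
```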
> ...However GC has speed penalties...
The empirical data says that this isn't true. What *is* true is that (a) GC
amortizes storage allocation costs differently, and therefore incurs an
occasional long pause (~20ms in modern collectors), and (b)
high-performance GC typically requires 3x-10x more DRAM (real memory, not
virtual) than a comparable program using manual storage management. Or at
least, these statements were true before the C4 collector was announced;
I'm not sure how that collector may have changed the answers.
> To handle this problem, I introduced an allocation type, e.g. the
> metacore-programmer
> can force allocation to occur on the call stack, like C automatic
> variables.
> i.e.
>
> tmp var x:FooDataType ;
>
> will force x of data-type FooDataType to be allocated on the stack. In
> conjunction with call-by-ref procedures
> this allows writing C-style code for local memory. These temporaries
> allow closures and
> dictionaries for (type) classes to be allocated on the stack where
> possible, using some simple form of
> allocation-type inference.
>
These are known as unboxed types. Automatic stack allocation is a good
thing, but it should be done generally. What you need to do this is escape
analysis, which is *not* a simple inference problem in the general case.
Conservative escape analysis is often good enough in practice.
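The distinction can be illustrated in plain C, where the programmer performs
the escape analysis by hand (the function names here are invented):

```c
#include <stdlib.h>

typedef struct { int x, y; } Point;

/* Non-escaping use: the Point never outlives this frame, so a
 * compiler with escape analysis could allocate it on the stack,
 * exactly as written here. */
int manhattan(int x, int y) {
    Point p = { x, y };           /* stack-allocated: does not escape */
    return p.x + p.y;
}

/* Escaping use: the Point is returned to the caller, so it must be
 * heap-allocated.  A conservative analysis falls back to this case
 * whenever it cannot prove non-escape. */
Point *make_point(int x, int y) {
    Point *p = malloc(sizeof *p); /* escapes: heap allocation required */
    p->x = x;
    p->y = y;
    return p;
}
```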
However, note that automatically stack-allocated objects are still in the
heap conceptually, and you still need to pass around pointers to these
objects. It is helpful if your compiler can implement the stack-allocation
pass as a source-to-source (or at least IL to IL) transformation. This
can't be done without region types in the IL.
Finally, note that a copying collector either needs to know where the stack
lives in order to avoid copies, or it needs to be allowed to copy
stack-allocated objects into the heap while performing the copying
collection. Either solution works, but the first solution is preferred. The
problem with the second solution is that it can cause the portion of the
live set that is stored in the current generation to *grow* as a
consequence of collection. This violates one of the key assumptions in
generational collectors, which is that the post-collection live set in the
heap is less than or equal to the pre-collection live set in the heap.
> Question: What do you think, regarding to your experiences with BitC, of
> the following Garbage collector interface design?
>
> Basic Design:
> The rootset is represented by using a stack of tuples
> (pointer-to-memory-area,type-descriptor-for-memory-area) for each thread.
>
Making this per-thread is fine. Each thread should have two root sets: one
for known-thread-local roots and the other for possibly-shared roots. This
lets a clever collector avoid some expensive locking in some important
cases. The Henderson work (see below) can readily be extended for this.
Note that the thread escape problem is very similar to the automatic stack
allocation problem - both require escape analysis.
Most collectors these days are "tagless" collectors - the collector is not
aware of source-level type information. It only knows about which words
contain pointers.
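A tagless collector typically drives tracing off a per-type pointer map - one
bit per word saying "this word holds a pointer." A rough sketch, with the
layout and names invented for illustration:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical tagless layout: the collector sees only a word count
 * and a bitmap, never source-level types.  Bit i set means word i of
 * the object holds a pointer the collector must trace. */
typedef struct {
    size_t   n_words;
    uint32_t ptr_map;    /* supports objects up to 32 words here */
} TypeDescriptor;

/* Count (or, in a real collector, mark/forward) the pointer words
 * of an object described by td. */
int trace_pointers(void **obj, const TypeDescriptor *td) {
    int traced = 0;
    for (size_t i = 0; i < td->n_words; i++) {
        if (td->ptr_map & (1u << i)) {
            /* a real collector would trace obj[i] here */
            traced++;
        }
    }
    return traced;
}
```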
> Each heap allocation will automatically push a corresponding tuple onto the
> root-set-stack. Stack allocations must be "manually"
> pushed onto the root-set-stack as necessary.
>
Heap allocations, by definition, are not roots. The allocation result is
immediately stored into another location - almost invariably somewhere in
the stack frame. The root set consists of:
- Global pointer-typed variables
- The current frame pointer for each thread
- Pointers to your heap that may reside in foreign objects. This
happens in scripting and native call scenarios.
You also need to consider how object pinning is to be implemented. This is
necessary for I/O purposes, and also for binding to the native system call
interface.
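Pinning usually amounts to a per-object flag that the moving phase of the
collector must honor. A toy illustration (all names and the flat-array "heap"
are invented; real collectors pin via handles or pinned regions):

```c
#include <stdbool.h>

/* Toy moving collector: compaction relocates objects unless they are
 * pinned.  A pinned object's address must stay stable because some
 * I/O buffer or native callee holds a raw pointer into it. */
typedef struct {
    int  addr;     /* stand-in for the object's location */
    bool pinned;
} Obj;

/* Relocate every unpinned object to a fresh address; return how many
 * objects actually moved. */
int compact(Obj *objs, int n, int new_base) {
    int moved = 0;
    for (int i = 0; i < n; i++) {
        if (!objs[i].pinned) {
            objs[i].addr = new_base + moved;
            moved++;
        }
    }
    return moved;
}
```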
Since you are implementing in C, you should look at:
Fergus Henderson, "Accurate Garbage Collection in an Uncooperative
Environment"
Rafkind, Wick, Regehr, and Flatt, "Precise Garbage Collection for C"
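Henderson's technique, roughly: each function pushes a record naming its
GC-visible locals onto a linked "shadow stack," so the collector can
enumerate accurate roots without compiler cooperation. A simplified sketch
(the field and function names are mine, not Henderson's):

```c
#include <stddef.h>

/* Simplified Henderson-style shadow stack: each frame record lists
 * the addresses of the frame's pointer-valued locals, so the
 * collector can enumerate accurate roots by walking the chain. */
typedef struct FrameRoots {
    struct FrameRoots *prev;
    int    n_roots;
    void **roots[8];   /* addresses of this frame's pointer locals */
} FrameRoots;

static FrameRoots *shadow_top = NULL;

void push_frame(FrameRoots *f) { f->prev = shadow_top; shadow_top = f; }
void pop_frame(void)           { shadow_top = shadow_top->prev; }

/* Walk the shadow stack, visiting every registered root slot. */
int count_roots(void) {
    int n = 0;
    for (FrameRoots *f = shadow_top; f != NULL; f = f->prev)
        n += f->n_roots;
    return n;
}
```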
> Further there is a resource set, i.e. a set of pointers (I have not
> yet definitely decided how to represent this), that holds pointers
> to all data on the heap.
>
This resource set is, by definition, the list of canonical roots that I
gave above.
Jonathan
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev