Changes in the D2 design to help the GC?

bearophile Wed, 15 Jul 2009 14:35:18 -0700

In Java the GC is able to collect garbage very quickly, so people in Java 
allocate many small objects quite often.
In functional-style languages, like Scala, Clojure, F#, etc, most data is 
immutable, so again the GC is under a lot of pressure, allocating and freeing 
many small structures all the time.


D2 syntax allows both styles of programming (you can program in D almost as 
in Java, if you want), but if you follow either of those two styles you will 
see that the current D GC is much less efficient than the Java/F# ones, and 
this leads to low performance. (Scoped classes are not enough.)

I am not a GC expert yet, but I'm certain there are ways to improve the 
current situation. Besides improving the GC itself, there may be ways to modify 
the current design of D2 a bit to allow a more efficient GC. Do you have 
ideas?

Some time ago I suggested splitting D pointers into two types: the GC-managed 
ones and the ones that point into the C heap, which the GC never touches. The 
type system can ensure they never get mixed by mistake. Now I think (just an 
idea) the GC-managed pointers can be split further into two types: the ones 
fully managed by a moving GC (see below) and the ones managed by a conservative 
GC, whose memory is pinned, so the GC doesn't move it around. The type system 
will ensure these three groups don't mix unless the programmer is really 
determined to mix them :-)
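
To make the three groups concrete, here is a minimal sketch that imitates them 
with library wrapper structs in today's D. It is only an illustration: the 
names CPtr, MovingPtr, PinnedPtr and unsafePin are hypothetical and exist in 
neither D2 nor Phobos, and the current GC does not actually move anything.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical wrapper types, one per pointer group.
struct CPtr(T)      { T* raw; }  // C heap, never touched by the GC
struct MovingPtr(T) { T* raw; }  // would be managed by a moving GC
struct PinnedPtr(T) { T* raw; }  // conservatively managed, never relocated

// Distinct struct types do not convert implicitly into each other,
// so the groups cannot be mixed by mistake.
static assert(!is(CPtr!int : MovingPtr!int));
static assert(!is(MovingPtr!int : PinnedPtr!int));
static assert(!is(PinnedPtr!int : CPtr!int));

// Mixing needs an explicit, easy-to-grep escape hatch (hypothetical name).
PinnedPtr!T unsafePin(T)(MovingPtr!T p)
{
    return PinnedPtr!T(p.raw);
}

void main()
{
    auto c = CPtr!int(cast(int*) malloc(int.sizeof));
    scope (exit) free(c.raw);
    *c.raw = 1;

    auto m = MovingPtr!int(new int);  // stand-in: today's GC never moves this
    *m.raw = 2;

    // PinnedPtr!int p = m;           // would not compile: different types
    PinnedPtr!int p = unsafePin(m);   // the "really determined" route
    assert(*p.raw == 2);
}
```

A real design would put this distinction in the language itself, so that the 
compiler and the GC both know which pointers may be relocated.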

A simple idea of mine to improve the GC (without changing the D2 language yet) 
is to split the D GC into two parts. The first is a moving one that acts like a 
Java-style GC, especially useful in SafeD code. It will become the GC used in 
OOP/functional-style code, and it's probably the one that will be used in most 
of the code of most D programs. The second part acts in a conservative way, 
like the current GC, so it's safer. It manages "pinned" blocks of memory that 
can't be moved; such memory is usually the one managed in lower-level D 
modules, by user-written collections, etc. The performance of this second part 
will be lower (like the current one), but most data will not be managed by it 
anyway.
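
A toy model of the two parts, just to show the difference in behaviour. This 
is not how a real collector is written: MovingHeap and its handles are 
hypothetical, liveness is set by hand instead of being traced, and a plain 
`new int` stands in for the pinned part.

```d
// Toy moving heap: clients hold handles, never raw addresses,
// so the heap is free to relocate the payloads.
struct MovingHeap
{
    int[] cells;      // live payloads, kept contiguous
    size_t[] slotOf;  // handle -> index into cells
    bool[] live;      // handle -> still in use? (set by hand here)

    size_t alloc(int value)
    {
        cells ~= value;
        slotOf ~= cells.length - 1;
        live ~= true;
        return slotOf.length - 1;  // the new handle
    }

    void release(size_t h) { live[h] = false; }

    ref int get(size_t h) { return cells[slotOf[h]]; }

    // Compaction: copy the survivors together and fix up their handles.
    void collect()
    {
        int[] survivors;
        foreach (h; 0 .. slotOf.length)
        {
            if (!live[h]) continue;
            survivors ~= cells[slotOf[h]];
            slotOf[h] = survivors.length - 1;
        }
        cells = survivors;
    }
}

void main()
{
    MovingHeap movable;
    auto a = movable.alloc(10);
    auto b = movable.alloc(20);
    auto c = movable.alloc(30);

    int* pinned = new int;  // stand-in for the non-moving part
    *pinned = 99;

    movable.release(b);
    movable.collect();      // payload of c moves from slot 2 to slot 1

    assert(movable.cells.length == 2);
    assert(movable.get(a) == 10 && movable.get(c) == 30);
    assert(*pinned == 99);  // the raw pointer was never invalidated
}
```

The moving part can compact because nothing outside it holds a raw address; 
the pinned part has to leave blocks where they are, which is why low-level 
code holding raw pointers belongs there.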

When you use LDC, the slow GC is one of the few parts of the D language that 
still have low performance (the other two are that currently D isn't able to 
inline closures and virtual methods. Such things too will eventually need to be 
addressed if D wants to become high-performance. I can leave those topics to 
other posts/threads).

Bye,
bearophile
