On Mon, 27 Dec 2010 09:12:53 -0700, Steven Schveighoffer
<[email protected]> wrote:
While fixing a design issue in druntime, I re-discovered how crappy the
conservative GC can be in certain situations.

The issue involves the array appending cache, which is used to
significantly speed up array appends. Essentially, since the array
appending cache was scanned as containing pointers, it would 'hold
hostage' any arrays that were appended to. As it turns out, this was
necessary, because there was no mechanism to update the cache when the
blocks were collected by the GC. I added this mechanism, and discovered
-- it didn't help much :)

The test case I was using was posted to the newsgroup a few months
back. Basically, the test was to append to an array until it consumed
at least 200MB. A single test takes a while, but what's more disturbing
is, if you run the same test again, the memory used for the first test
*isn't released*. My first thought was that the array append cache was
holding this data hostage, but that was not the problem.
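For reference, a test of the kind described above might look roughly like the sketch below. It is a reconstruction from the description, not the program that was posted to the newsgroup: the names appendUntil and target are made up for illustration, and it assumes a druntime recent enough to provide GC.stats.

```d
import core.memory : GC;
import std.stdio : writefln;

/// Append to a fresh array until it holds at least `targetBytes`,
/// then let it go out of scope. Returns the final size in bytes.
/// (Hypothetical helper, named here for illustration only.)
size_t appendUntil(size_t targetBytes)
{
    int[] arr;
    while (arr.length * int.sizeof < targetBytes)
        arr ~= 0;
    return arr.length * int.sizeof;
}

void main()
{
    enum size_t target = 200 * 1024 * 1024; // 200MB, as in the test described above

    foreach (run; 0 .. 2)
    {
        appendUntil(target);
        GC.collect(); // nothing in the program refers to the array any more
        auto s = GC.stats();
        writefln("run %s: used=%s free=%s bytes", run, s.usedSize, s.freeSize);
    }
}
```

If the first run's block is pinned by a false pointer, the used size reported after the second run would be expected to stay well above 200MB; the exact numbers depend on the platform and GC version.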
The problem is that when you allocate 1/20th the address space of the
process to one contiguous memory block, the chances that the
conservative GC will detect a false pointer into that block are very
high. What's worse, if the 'pointer' into the block is somewhere in
TLS, global data, or high on the stack, that block is stuck for pretty
much the life of the program or thread.
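To put a rough number on this, here is a back-of-the-envelope sketch. It assumes a 32-bit process with a 4GB address space and treats every scanned word as an independent, uniformly random value, which real program data is not, so the results are only an order-of-magnitude guide. The function names are invented for the example.

```d
import std.math : pow;
import std.stdio : writefln;

/// Fraction of the address space that one block covers.
double coveredFraction(double blockBytes, double addressSpaceBytes)
{
    return blockBytes / addressSpaceBytes;
}

/// Chance that at least one of `words` independent, uniformly random
/// word values lands inside a region covering `fraction` of the space.
/// (Simplifying assumption: real data is far from uniformly random.)
double pinChance(double fraction, double words)
{
    return 1.0 - pow(1.0 - fraction, words);
}

void main()
{
    enum double MB = 1024.0 * 1024.0;
    immutable f = coveredFraction(200 * MB, 4096 * MB); // 200MB in a 4GB space
    writefln("covered fraction: %.4f", f);
    foreach (words; [1.0, 10.0, 100.0])
        writefln("%.0f scanned words: chance of a false pointer %.3f",
                 words, pinChance(f, words));
}
```

Under this model the block covers a little under 5% of the address space, so it takes only a handful of pointer-sized values in the stack, TLS, or global data before a hit becomes likely.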
So I was thinking of possible ways to solve this problem. Solving it
perfectly is not really possible unless we implement precise scanning in
all areas of memory (heap, stack, global data). While that certainly
*could* be a possibility, it's not likely to happen any time soon.

What about tools to make deallocation easier? For example, we have
scope(exit), which you could potentially use to ensure a memory block is
deallocated on exit from a scope. What about a thread exit? What about
declaring a scope object at a high level that nested scopes could use to
deallocate from? Making this a bit easier might be a good alternative
while precise scanning hasn't been adopted yet.
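As a minimal sketch of both ideas: the first function uses scope(exit) with GC.free from core.memory, and the BlockOwner struct is a purely hypothetical helper (nothing like it exists in druntime or Phobos under that name) showing what a "scope object declared at a high level" could look like. In both cases the caller must make sure no slice of a freed block is used afterwards.

```d
import core.memory : GC;

/// Free a large temporary explicitly when the scope exits, whether by
/// return or by exception, instead of waiting for the conservative GC.
size_t fillAndRelease(size_t count)
{
    auto buf = new int[](count);
    scope (exit) GC.free(buf.ptr); // buf must not be used after this scope

    foreach (i, ref x; buf)
        x = cast(int) i;
    return buf.length;
}

/// Hypothetical helper: owns GC blocks and frees them all when it
/// goes out of scope.
struct BlockOwner
{
    private void*[] blocks;

    @disable this(this); // no copies, so each block is freed exactly once

    /// Allocate an array and remember its block for later release.
    T[] make(T)(size_t count)
    {
        auto arr = new T[](count);
        blocks ~= cast(void*) arr.ptr;
        return arr;
    }

    ~this()
    {
        foreach (p; blocks)
            GC.free(p);
        blocks = null;
    }
}

/// A nested scope allocates through an owner declared higher up.
size_t worker(ref BlockOwner owner, size_t count)
{
    auto tmp = owner.make!int(count);
    tmp[] = 1;
    return tmp.length;
}

void main()
{
    assert(fillAndRelease(1_000) == 1_000);

    BlockOwner owner; // everything made through it is freed when main ends
    assert(worker(owner, 500) == 500);
}
```

A thread-exit variant could follow the same pattern with an owner whose lifetime matches the thread's entry function, though that is an assumption about usage rather than an existing facility.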
Any other ideas?

-Steve

First, I'd like to point out that precise scanning of the heap (and I'll
assume this can be extended to globals) is a long-standing enhancement
request. It's issue 3463
(http://d.puremagic.com/issues/show_bug.cgi?id=3463). It does have a
patch, but it's now out of date and needs someone to update it (hint,
hint). Second, the false pointer problem disappears (for practical
purposes) when you move to 64-bit, since even a 200MB block is then a
negligible fraction of the address space. Third, modern GCs (i.e.
thread-local GCs) can further reduce the false pointer issue.
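To illustrate the 64-bit point with numbers, the sketch below compares how much of the address space a 200MB block covers. The 47-bit figure is an assumption about typical user-space size on x86-64, not something stated above, and blockFraction is a name invented for the example.

```d
import std.math : pow;
import std.stdio : writefln;

/// Fraction of a 2^addressBits-byte address space covered by one block.
double blockFraction(double blockBytes, int addressBits)
{
    return blockBytes / pow(2.0, addressBits);
}

void main()
{
    enum double block = 200.0 * 1024 * 1024; // the 200MB block from the test
    writefln("32-bit address space:            %.3g", blockFraction(block, 32));
    writefln("47-bit user space (assumed):     %.3g", blockFraction(block, 47));
}
```

Going from 32 to 47 address bits shrinks the covered fraction by a factor of 2^15, which is why a random word is so much less likely to look like a pointer into the block.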