On Wed, 29 Dec 2010 12:27:02 -0700, Steven Schveighoffer
<[email protected]> wrote:
On Wed, 29 Dec 2010 14:00:17 -0500, Robert Jacques <[email protected]>
wrote:
On Wed, 29 Dec 2010 07:37:10 -0700, Steven Schveighoffer
<[email protected]> wrote:
On Tue, 28 Dec 2010 01:23:22 -0500, Robert Jacques <[email protected]>
wrote:
Second, the false pointer problem disappears (for practical purposes)
when you move to 64-bit.
I'm not sure I like this "solution", but you are correct. This is
somewhat mitigated, however, by the way memory is allocated (I'm
assuming not sparsely throughout the address space, and also low in
the address space). It certainly makes it less likely that a random
64-bit long points at data, but it's not inconceivable to have 32 bits
of 0 interspersed with non-zero data. For example, a struct with two
ints back to back, where one int is frequently 0, is quite plausible.
Ah, but the GC can allocate RAM in any section of the address space it
wants, so it would be easy for the upper 32 bits to be always non-zero
by design.
huh? How does the GC control whether you set one int to 0 and the other
not?
Hmm, perhaps an example would be best. Consider a 64-bit thread-local GC.
It might allocate RAM in the following pattern:
[2-bit shared/local][16-bit thread ID][6-bit tag][40-bit pointer]
So first, the address space is divided up into four regions: [11...] and
[00...] are left free for user use (external allocation, memory-mapped
files, etc.), and [01...]/[10...] are used to denote shared/thread-local.
Next you have the thread ID, which is one way to support thread-local
allocation/thread-local GCs.
Then you might have a 6-bit region that lock-free algorithms could use as
a tag (and would be ignored by the shared GC), or local GCs could use for
internal purposes.
Finally, you have a 40-bit region with what we normally think of as a
'pointer'.
The thing to understand is that 64-bit computers are really 40-bit
computers, currently. And given that 40 bits will hold us until we get to
petabytes, we should have 24 bits to add meta-info to our pointers for
some time to come. So, as long as we choose this meta-info carefully, we
can avoid common bit patterns and the associated false pointers.
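A rough sketch of how those meta-bits might be packed and unpacked (the
field widths follow the layout above; the constant names and values are
just illustrative):

import std.stdio;

enum ulong regionShift = 62;               // 2-bit shared/local region
enum ulong threadShift = 46;               // 16-bit thread ID
enum ulong tagShift    = 40;               // 6-bit tag
enum ulong addrMask    = (1UL << 40) - 1;  // low 40 bits: the 'real' pointer

ulong makeTagged(ulong region, ulong threadId, ulong tag, ulong addr)
{
    return (region   << regionShift)
         | (threadId << threadShift)
         | (tag      << tagShift)
         | (addr & addrMask);
}

ulong regionOf(ulong p) { return p >> regionShift; }
ulong threadOf(ulong p) { return (p >> threadShift) & 0xFFFF; }
ulong tagOf(ulong p)    { return (p >> tagShift) & 0x3F; }
ulong addrOf(ulong p)   { return p & addrMask; }

void main()
{
    // Region 0b01 = thread-local, thread 7, tag 0, some 40-bit address.
    ulong p = makeTagged(0b01, 7, 0, 0x12_3456_7890);
    writefln("region=%b thread=%d tag=%d addr=0x%x",
             regionOf(p), threadOf(p), tagOf(p), addrOf(p));
}

The point is just that the GC decides which of these words it hands out,
so it can keep the upper bits away from patterns that commonly occur in
user data.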
Third, modern GCs (i.e. thread-local GCs) can further reduce the
false pointer issue.
I'd rather have precise scanning :) There are issues with
thread-local GCs.
The only issue with thread-local GCs is that you can't cast to
immutable and then share the result across threads. And eventually,
we'll have a) better ways of constructing immutable and b) a deep idup
to mitigate this.
I'm talking about collection cycles -- they necessarily need to scan
both thread-local and shared heaps because of the possibility of
cross-heap pointing. Which means you gain very little from thread-local
heaps. The only thing it buys you is that you can assume the shared heap
has no pointers into local heaps, and local heaps have no pointers into
other local heaps. But if you can't run a collection on just one heap,
there is no point in separating them.
The defining feature of thread-local GCs is that they _can_ run collection
cycles independently from each other.
Plus it's easy to cast without moving the data, which is not undefined
currently if you take the necessary precautions, but would cause large
problems with separate heaps.
Casting from immutable/shared would be memory valid; it's casting to
immutable and shared where movement has to occur. As casting between
immutable/shared and mutable is logically invalid, I'd expect these
cases to be rare (once we solve the problem of safely constructing
complex immutable data).
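For concreteness, the pattern at issue looks roughly like this (a minimal
std.concurrency sketch; the names are just illustrative). With per-thread
heaps, the cast alone wouldn't be enough; the data would have to be moved
or idup'd:

import core.thread : thread_joinAll;
import std.concurrency;
import std.stdio;

void worker()
{
    // Receives a reference into what was the spawning thread's heap.
    auto msg = receiveOnly!(immutable(int)[])();
    writeln("worker sees: ", msg);
}

void main()
{
    int[] buf = [1, 2, 3];  // allocated on main's (would-be local) heap

    // The cast is what lets the data cross the thread boundary; with
    // thread-local heaps it would have to be moved or copied instead of
    // just being reinterpreted in place.
    auto frozen = cast(immutable(int)[]) buf;

    auto tid = spawn(&worker);
    tid.send(frozen);

    thread_joinAll();  // wait for the worker before exiting
}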
If we have the typeinfo of a memory block for the GC to parse, you can
also rule out cross-thread pointers without thread-local GCs (for
unshared data).
Which basically creates all the problems of thread-local GC, with very
few of the advantages.
What are the advantages? I'm not being sarcastic, I really don't know.
-Steve
The major advantage is that they match or outperform modern
generational/concurrent GCs, even when backed by conservative mark-sweep
collectors (according to Apple). (TLGCs can also be backed by modern
collectors.) Essentially, TLGCs work by separately allocating and
collecting objects which are not shared between threads. Since these
collectors don't have to be thread-safe, they can be more efficient in
their implementation, and collections only have to deal with their subset
of the heap and their own stack. This reduces pause times, false
pointers, etc. TLGCs are inherently parallel and have interleaved pause
times, which can greatly reduce the "embarrassing pause" effect (i.e. your
worker thread running out of RAM won't pause your GUI or stop other
workers from fulfilling HTTP requests).
In D, they become even more important, since to the best of my knowledge D
can never support generational GCs and only supports concurrent GCs on
certain OSes. So, if D wants a competitive GC, a thread-local GC is the
only cross-platform option I know of. (C function calls prevent most
generational/concurrent GC algorithms from being applicable to D, since
those algorithms generally rely on write barriers that can't be enforced
in code D doesn't compile.)