On Wed, 10 Feb 2016 08:57:51 +0000, thedeemon wrote:

> Currently (at least last time I checked) GC pauses the world, then does
> all the marking in one thread, then all the sweeping.

Right.

> We can do the
> marking in several parallel threads (this is much harder to implement
> but still doable),

Parallel marking would not be a breaking change by any means. No user code runs during GC collections, so we can do anything.

The major fly in the ointment is that creating threads normally invokes the GC, since Thread is an object, and invoking the GC during a collection isn't the best. This can be solved by preallocating several mark threads. Then you just divide the stack and roots between those threads. Moderately annoying sync issues.

This doesn't guarantee an even distribution of work. You can solve that problem with a queue, though that requires locking.

The main wrinkle is writing a bit to shared data structures, which can be slow. On the other hand, in the mark phase, we're only ever going to write the same value to each, so it doesn't matter if GC thread A . I don't know how to tell the CPU that it doesn't have to read back the memory before writing it.

> and we can kick the sweeping out of stop-the-world
> pause and do the sweeping lazily

This would be a breaking change. Right now, your destructors are guaranteed to run when no other code is running. You'd need to introduce locks in a few places.

I'm not saying this is a bad thing. I think people generally wouldn't notice if we made this change. But some code would break, so we'd have to stage that change.

Anyway, I'm hacking up parallel mark phase to see how it would work. I could use some GC benchmarks if anyone's got them lying around.
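
To make the shape of the idea concrete, here is a rough sketch of the preallocated-workers-plus-queue scheme described above. It is written in Python purely for brevity and is not the D runtime's code: `Node`, `MarkPool`, and the `marked` flag are hypothetical names standing in for heap objects, the mark threads, and the mark bit. Python's interpreter lock also means it shows only the structure, not any real speedup.

```python
import queue
import threading


class Node:
    """Hypothetical stand-in for a heap object; `marked` plays the mark bit."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.marked = False


class MarkPool:
    """Mark workers are created up front, so no thread is allocated mid-collection."""

    def __init__(self, n_workers=4):
        # Shared work queue; queue.Queue does its own locking internally.
        self.work = queue.Queue()
        self.threads = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(n_workers)
        ]
        for t in self.threads:
            t.start()

    def _run(self):
        while True:
            node = self.work.get()
            try:
                # Two workers may both see marked == False and both set it.
                # That is harmless here: every writer stores the same value,
                # and the only cost is some duplicated scanning.
                if not node.marked:
                    node.marked = True
                    for child in node.children:
                        if not child.marked:
                            self.work.put(child)
            finally:
                # Children are queued before the parent is reported done,
                # so the outstanding-task count cannot reach zero early.
                self.work.task_done()

    def mark(self, roots):
        """Seed the queue with the roots and block until marking finishes."""
        for root in roots:
            self.work.put(root)
        self.work.join()


if __name__ == "__main__":
    a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
    a.children = [b, c]
    b.children = [a]  # a cycle, to show that marking still terminates
    MarkPool(n_workers=2).mark([a])
    print([n.name for n in (a, b, c, d) if n.marked])
```

The queue is what evens out the work when one root leads to a much larger subgraph than another, at the price of the locking mentioned above. A real collector would more likely give each thread a local mark stack and only share work when a thread runs dry, but that is beyond this sketch.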