On Tuesday, 20 January 2015 at 22:25:05 UTC, deadalnix wrote:
> Any serious GC can run concurrently (instead of stopping the
> world). That means any serious GC must be able to handle
> allocations while collecting.
Concurrent GC is too expensive for a proper system-level
language. Stopping the thread/world is OK if you:
1. Statically determine what you need to scan to get full
coverage (by static typing) so that you discriminate/classify
pointers effectively.
2. Cluster all the pointers that need scanning on the same cache
lines by design.
3. Use exact (precise) scanning.
4. Use a collector that is carefully written for cache locality
and minimizes cache misses using batching.
5. Generate the runtime to take advantage of information from
pre-linking static analysis.
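
To make points 1 and 3 concrete, here is a minimal sketch of exact
marking, written in Python only as a model. The POINTER_SLOTS table,
the Obj class and the mark function are hypothetical names made up
for illustration; in a real system the compiler would emit the
per-type pointer layout from static types. The point is that the
collector only ever follows slots declared as pointers, so integers
are never mistaken for references and pointer-free objects are never
scanned.

```python
from collections import deque

# Hypothetical type descriptors: for each type, the indices of the slots
# that hold GC pointers. A compiler would generate this from static types.
POINTER_SLOTS = {
    "Node": (0, 1),  # slots 0 and 1 are pointers, slot 2 is a plain integer
    "Blob": (),      # no pointers at all, so its payload is never scanned
}


class Obj:
    """A heap object modelled as a type name plus a list of slots."""

    def __init__(self, type_name, slots):
        self.type_name = type_name
        self.slots = slots
        self.marked = False


def mark(roots):
    """Mark everything reachable from roots; return how many objects were marked."""
    work = deque(roots)
    visited = 0
    while work:
        obj = work.popleft()
        if obj is None or obj.marked:
            continue
        obj.marked = True
        visited += 1
        # Exact scanning: follow only the slots the descriptor declares.
        for i in POINTER_SLOTS[obj.type_name]:
            work.append(obj.slots[i])
    return visited


blob = Obj("Blob", [b"raw bytes"])
leaf = Obj("Node", [None, blob, 7])
root = Obj("Node", [leaf, None, 42])
garbage = Obj("Node", [None, None, 0])  # unreachable from root

marked_count = mark([root])
print(marked_count, garbage.marked)  # prints: 3 False
```

This models only the marking logic. It says nothing about cache
behaviour, which is what points 2 and 4 are about.
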
Remember that the memory bus can push up to 6GB/s. So in 5ms you
can read up to 30MB, which is roughly 470,000 cache lines
(assuming 64-byte lines). So you should be able to do fine with
50,000-100,000 GC-allocated objects.
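
The back-of-envelope budget can be spelled out as a few lines of
Python. The 6GB/s and 5ms figures are the ones used in this post; the
64-byte cache line size is my assumption (typical for current x86
hardware), and the variable names are just for illustration.

```python
BANDWIDTH_BYTES_PER_S = 6_000_000_000  # 6 GB/s, the figure used in the post
PAUSE_MS = 5                           # pause budget from the post
CACHE_LINE_BYTES = 64                  # assumption: typical x86 cache line

# Bytes the memory bus can deliver within one pause.
bytes_per_pause = BANDWIDTH_BYTES_PER_S * PAUSE_MS // 1000

# Whole cache lines that fit in that many bytes.
lines_per_pause = bytes_per_pause // CACHE_LINE_BYTES


def lines_per_object(object_count):
    """Cache lines the collector may touch per object within the budget."""
    return lines_per_pause / object_count


print(bytes_per_pause, lines_per_pause)
# prints: 30000000 468750
print(lines_per_object(50_000), lines_per_object(100_000))
# prints: 9.375 4.6875
```

So with 50,000-100,000 objects there is a budget of roughly 5 to 9
cache lines per object, and that is an upper bound which assumes the
collector actually reaches peak bandwidth.
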
That means a mix of regular heap and GC convenience is possible,
even in a real-time app, but you have to design for it all the
way through (including the language constructs).
> Not only is handling these cases unlikely to make the GC any
> slower, but in fact, this is required to make the GC faster.
Deallocators mean you have to track object boundaries, and they
make partial deallocation (deallocating a class instance except
for a live member that is still referenced) trickier.
K.I.S.S. + limited use of GC + GC designed language constructs +
good static analysis + generated runtime => fast collection if
you write for it.