Ok, let's troll :)

Linus' account of the problem is (as you would expect) very precise on the low-level side of the force. Cache locality is indeed an important problem, and one area where GC Does Suck (TM). But there are more solutions than the basic generational-GC technique that Linus mentions.

First, Escape Analysis-based optimizations like stack allocation allow many objects to be deallocated as soon as they become unused - just like Linus wants.

Second, a base type system that includes lightweight objects, like C# today and very likely Java in JDK 8+, can also help a lot - for example by packing some tightly-coupled objects together so they always exhibit excellent memory locality, or by allowing by-value arrays with the same benefit.

Finally, for massive applications whose heaps vastly exceed even the L2/L3 caches - your typical 64-bit JavaEE server process with 8 GB of heap and hundreds of live threads - the problem of locality vanishes, because even a competing C application wouldn't manage a very high cache hit rate: it's always thrashing the CPUs between tons of concurrent threads, and trashing the caches by pumping gigabytes of data off a DBMS and into webpages and other outputs, every second.
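To make the escape-analysis point concrete, here's a minimal sketch (class and method names are mine, not from any real codebase). The Point below never escapes its method, so HotSpot's escape analysis may scalar-replace it - its fields live in registers, no heap allocation happens, and the GC never sees it. Whether that actually occurs is a JIT decision, not a language guarantee:

```java
// Point never escapes distSq(), so HotSpot's escape analysis can
// scalar-replace it: the fields are kept in registers, no heap
// allocation is done, and there is nothing for the GC to trace.
// (Whether this happens depends on JIT heuristics and inlining.)
final class Point {
    final double x, y;
    Point(double x, double y) { this.x = x; this.y = y; }
}

public class EscapeDemo {
    static double distSq(double x, double y) {
        Point p = new Point(x, y);   // candidate for stack allocation
        return p.x * p.x + p.y * p.y;
    }

    public static void main(String[] args) {
        System.out.println(distSq(3, 4)); // 25.0
    }
}
```

Semantically the code behaves identically either way; the optimization only removes the allocation and the later collection work.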
The ugly part is Linus's talk about GCs that use refcounting internally (as in, an actual 'refcount' field stored in each object and updated at each change of incoming pointers), and even expose it to the program... is that serious? It suffices to say that GC-by-refcount is a dead idea, mostly because the update of the refcount must be thread-safe, so it adds the overhead of at least an atomic CAS to every single freaking reference update - which is MASSIVE overhead by today's standards.

Another subtle demonstration of Linus' modest familiarity with GC is that he does NOT point (even indirectly) at some important problems of GC that he, as a kernel developer, would certainly use to wipe the floor with GC's soul:

1) The collector often needs to visit cold memory pages, even ones that may be swapped to disk. It's horrible to bring a page from RAM all the way up to L1 cache - or, even worse, from disk to RAM - just to perform a marking cycle that may not even find anything to deallocate in that page. This is one area that receives intense research; there are already techniques that make the problem much smaller, but it's not yet in the solved-problems category.

2) Higher total memory usage, due to semispaces, fragmented slack in partitioned heaps, and other reasons. (Sometimes not a big problem, considering that many programs with manual memory reclamation also waste memory through high heap fragmentation, or due to very complex object graphs that are risky to deallocate incrementally - so app code tends to retain many objects longer than necessary, until the destructor of some "god object" is invoked. Not to mention leaks, which are much more common without GC...)

The most interesting part is Linus's claims about mindset. I agree on the potential problem, but I disagree that it's as severe in the real world as Linus implies. Looking at his example of copy-on-write/sharing, these techniques are _completely_ independent of memory reclamation policy.
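The per-update cost of thread-safe refcounting mentioned above can be sketched in a few lines of Java (the class is hypothetical, just to show where the atomic operations land). Every retain/release is an atomic read-modify-write - a CAS loop or locked instruction on most platforms - and a refcounting GC would pay that on every reference assignment:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of thread-safe refcounting. Each retain()/release() is an
// atomic read-modify-write; a GC based on refcounts would perform one
// of these on *every* pointer update, which is the overhead at issue.
class RefCounted {
    private final AtomicInteger refs = new AtomicInteger(1);

    void retain() {
        refs.incrementAndGet();              // atomic RMW
    }

    // Returns true when the last reference was dropped and the
    // object was reclaimed.
    boolean release() {
        if (refs.decrementAndGet() == 0) {   // atomic RMW
            free();
            return true;
        }
        return false;
    }

    protected void free() { /* reclaim resources here */ }

    int count() { return refs.get(); }
}
```

A tracing GC pays nothing at all on pointer stores (or only a cheap write-barrier), which is why refcounting lost this particular race.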
The only issue here is aliasing: if you know that some field can never "escape to the outer world" (via a public method's return or other means), then you can precisely control its lifecycle, and you can easily avoid redundant copies - either with explicit refcounting and COW, or sometimes without even that (when you share the object to build a new, short-lived structure that you know won't change the shared object).

GC is even better here, because these optimizations are easier to do in a GC'd language: the programmer knows that nothing really bad (heap corruption, process crash, or leak) can happen as a result of a subtle bug such as a double-free. The worst kind of bug you can get with these tricks is "logical corruption" - not copying an object that should have been copied, so one data structure changes data from another.

More: in theory, a compiler or runtime is able to do alias analysis and find opportunities to automatically suppress data copies or even inject refcounting. This optimization could even be dynamic, benefiting objects that could be shared because the code allows it (e.g. public fields) but in practice are never shared in a particular program execution. In practice these optimizations are tricky because they have other complications (e.g. not breaking the programmer's assumptions about object identity and sharing of monitors), and I think the only reason this is not an active area of research is that the cost/benefit is very small in most cases - and for the exceptions where it pays off, such as sharing large objects (think stream buffers) or expensive-to-build objects, we're usually better off with handcoded optimization anyway.

So this issue boils down to: are Java (and other GC-happy languages') programmers less willing to use these optimizations, because their memory management skills are dumbed down by the convenience of GC?
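For the copy-on-write/sharing point, here's a hypothetical sketch (the class and its names are mine, not a real API). Snapshots share the backing array in O(1); a writer copies lazily on its first mutation. Notice that a bug here - forgetting the `shared` check, say - would cause only "logical corruption", never heap corruption or a double-free:

```java
// Hypothetical copy-on-write buffer: snapshot() shares the backing
// array; set() copies it lazily on the first write after sharing.
// The 'shared' flag is conservative: an instance may clone even when
// no live alias remains, trading a copy for simplicity.
final class CowBuffer {
    private int[] data;
    private boolean shared;

    CowBuffer(int[] initial) {
        this.data = initial.clone();  // defensive copy of caller's array
    }

    // O(1): returns a logical copy that shares storage with this one.
    CowBuffer snapshot() {
        CowBuffer c = new CowBuffer(new int[0]);
        c.data = this.data;
        c.shared = this.shared = true;
        return c;
    }

    // Copy-on-write: materialize a private copy before the first write.
    void set(int i, int v) {
        if (shared) {
            data = data.clone();
            shared = false;
        }
        data[i] = v;
    }

    int get(int i) { return data[i]; }
}
```

Nothing in this class cares whether the old array is reclaimed by a GC or by manual free - which is exactly the point: the sharing technique is orthogonal to the reclamation policy.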
I think this may be true for the typical run-of-the-mill Java app programmer, but it's also very likely true for the typical Joe-sixpack C/C++ app programmer (are there any today? well, there certainly were, before Java became popular). Linus may have a very high regard for the skills of C programmers, but he's biased by his overwhelming collaboration with kernel and compiler hackers - not a "typical programmers" community in any language.

A+
Osvaldo

--
You received this message because you are subscribed to the Google Groups "The Java Posse" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/javaposse?hl=en.
