That's a fantastic response, thanks Osvaldo! -- Cédric
On Sun, Apr 24, 2011 at 11:37 AM, opinali <[email protected]> wrote:
> Ok, let's troll :) Linus' account of the problem is (as you would expect) very precise on the low-level side of the force. Cache locality is indeed an important problem, and one area where GC Does Suck (TM). But there are more solutions than the basic generational-GC technique that Linus mentions. First, Escape Analysis-based optimizations like stack allocation allow many objects to be deallocated as soon as they become unused - just like Linus wants. Second, a base typesystem that includes lightweight objects, like C# today and very likely Java in JDK 8+, can also help a lot, for example by packing some tightly-coupled objects together so they always exhibit excellent memory locality, or by allowing by-value arrays with the same benefit, etc. Finally, for massive applications whose heaps vastly exceed even the L2/L3 caches - your typical 64-bit JavaEE server process with 8GB of heap and hundreds of live threads - the problem of locality vanishes, because even a competing C application wouldn't manage a very high cache hit rate, simply because it's always thrashing the CPUs between tons of concurrent threads and trashing the caches by pumping gigabytes of data off a DBMS and into webpages and other outputs, every second.
>
> The ugly part is Linus's talk about the GCs that use refcounting internally (as in, an actual 'refcount' field stored in each object and updated at each change of incoming pointers), and even expose it to the program... is that serious? It suffices to say that GC-by-refcount is a dead idea, mostly because the update of the refcount must be thread-safe, so this adds the overhead of at least an atomic CAS to every single freaking reference update, which is MASSIVE overhead by today's standards.
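
To make the refcount cost concrete, here is a minimal Java sketch of what "an atomic CAS on every reference update" looks like. `RefCounted`, `Slot` and `RefCountDemo` are hypothetical names made up for illustration; they are not taken from the thread or from any real runtime, and a real refcounting collector would do this inside the VM rather than in user code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class: an object that carries its own reference count.
final class RefCounted {
    // The creator holds the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean released = false;

    // Called whenever a new pointer to this object is stored.
    void retain() {
        int n;
        do {
            n = refCount.get();
            if (n == 0) {
                throw new IllegalStateException("object already released");
            }
        } while (!refCount.compareAndSet(n, n + 1)); // CAS loop, retried under contention
    }

    // Called whenever a pointer to this object is overwritten or dropped.
    void release() {
        if (refCount.decrementAndGet() == 0) {
            released = true; // a real runtime would reclaim the memory here
        }
    }

    int count() { return refCount.get(); }
    boolean isReleased() { return released; }
}

// Hypothetical class: a field that holds a counted reference.
final class Slot {
    private RefCounted ref;

    // One reference store costs up to two atomic read-modify-write operations.
    synchronized void set(RefCounted next) {
        if (next != null) next.retain();
        RefCounted old = ref;
        ref = next;
        if (old != null) old.release();
    }
}

class RefCountDemo {
    public static void main(String[] args) {
        RefCounted obj = new RefCounted(); // count = 1
        Slot slot = new Slot();
        slot.set(obj);   // count = 2
        obj.release();   // creator drops its reference, count = 1
        slot.set(null);  // count = 0, object released
        System.out.println(obj.isReleased()); // prints true
    }
}
```

The point of the sketch is only the shape of `Slot.set`: what would be a plain pointer write under a tracing collector becomes a retain on the new target plus a release on the old one, each an atomic operation on a shared counter.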
>
> Another subtle demonstration of Linus' modest familiarity with GC is that he does NOT point (even indirectly) at some important problems of GC that he, as a kernel developer, would certainly use to wipe the floor with GC's soul:
>
> 1) The collector often needs to visit cold memory pages, even ones that may be swapped to disk. It's indeed horrible to bring a page from RAM all the way up to L1 cache, or even worse from disk to RAM, just to perform a marking cycle that may not even find anything to deallocate in that page. This is one area that receives intense research; there are already techniques that make the problem much smaller, but it's not yet in the solved problems category.
> 2) Higher total memory usage due to semispaces, fragmented slack in partitioned heaps, and other reasons. (Sometimes not a big problem, considering that many programs with manual memory reclamation also waste memory with high heap fragmentation, or due to very complex object graphs that are risky to deallocate incrementally so app code tends to retain many objects longer than necessary, until the destructor of some "god object" is invoked. Not to mention leaks, much more common without GC...)
>
> The most interesting part is Linus's claims about mindset. I agree on the potential problem; but I disagree that it's as severe in the real world as Linus implies. Looking at his example of copy-on-write/sharing, these techniques are _completely_ independent from memory reclamation policy. The only issue here is aliasing; if you know that some field can never "escape to the outer world" (by a public method's return or other means), then you can precisely control its lifecycle, and you can easily avoid redundant copies, either with explicit refcounting and COW, or sometimes not even needing that (when you share the object to build a new, short-lived structure that you know won't change the shared object).
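
The copy-on-write argument above can be sketched in a few lines of Java. `CowBuffer` and `CowDemo` are hypothetical names invented for this example, and the sketch makes two simplifying assumptions: it is single-threaded, and it uses a per-handle `shared` flag instead of a real refcount, so the handle that writes second may make one copy it did not strictly need.

```java
import java.util.Arrays;

// Hypothetical class: a buffer whose backing array never escapes,
// so sharing and copying can be controlled precisely.
final class CowBuffer {
    private int[] data;     // never returned to callers, so nobody else can alias it
    private boolean shared; // true while another CowBuffer may point at the same array

    CowBuffer(int[] initial) {
        this.data = initial.clone(); // defensive copy: the caller's array stays the caller's
        this.shared = false;
    }

    private CowBuffer(int[] data, boolean shared) {
        this.data = data;
        this.shared = shared;
    }

    // Cheap "copy": both handles point at the same array until one of them writes.
    CowBuffer share() {
        this.shared = true;
        return new CowBuffer(data, true);
    }

    int get(int i) { return data[i]; }

    // The real copy is deferred until the first write through a shared handle.
    void set(int i, int value) {
        if (shared) {
            data = Arrays.copyOf(data, data.length);
            shared = false;
        }
        data[i] = value;
    }

    boolean sharesStorageWith(CowBuffer other) { return data == other.data; }
}

class CowDemo {
    public static void main(String[] args) {
        CowBuffer a = new CowBuffer(new int[] {1, 2, 3});
        CowBuffer b = a.share();                      // no array copy yet
        System.out.println(a.sharesStorageWith(b));   // prints true
        b.set(0, 99);                                 // b copies before writing
        System.out.println(a.get(0) + " " + b.get(0)); // prints 1 99
    }
}
```

Nothing here depends on how the arrays are eventually reclaimed; the only thing that makes the trick safe is that `data` is private and never leaks out of the class.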
>
> GC is even better here, first because these optimizations are even easier to do in a GC language, because the programmer knows that nothing really bad (heap corruption, process crash, or leak) may happen as a result of a subtle bug, such as a double-free. The worst kind of bug you can get with these tricks is "logical corruption" (not copying an object that should have been copied, so one data structure changes data from another).
>
> More: in theory, a compiler or runtime is able to do alias analysis and find opportunities to automatically suppress data copies or even inject refcounting. This optimization could even be dynamic, benefiting objects that could be shared because the code allows it (e.g. public fields), but in practice are never shared in a particular program execution. In practice these optimizations are tricky because they have other complications (e.g. not breaking programmers' assumptions about object identity and sharing of monitors), and I think the only reason this is not an active area of research is that the cost/benefit is very small in most cases - and for the exceptions where it pays off, such as sharing large objects (think stream buffers) or expensive-to-build objects, we're usually better off with handcoded optimization anyway.
>
> So this issue boils down to: are Java (and other GC-happy-languages') programmers less willing to use these optimizations, because their memory management skills are dumbed down by the convenience of GC? I think this may be true for the typical run-of-the-mill Java app programmer, but also very likely true for the typical Joe-sixpack C/C++ app programmer (are there any today? well there certainly were, before Java became popular). Linus may have a very high regard for the skills of C programmers, but he's biased by his overwhelming collaboration with kernel and compiler hackers - not a "typical programmers" community in any language.
>
> A+
> Osvaldo
>
> --
> You received this message because you are subscribed to the Google Groups "The Java Posse" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to [email protected].
> For more options, visit this group at http://groups.google.com/group/javaposse?hl=en.

--
Cédric
