Ok, let's troll :)  Linus' account of the problem is (as you would
expect) very precise on the low-level side of the force.  Cache
locality is indeed an important problem, and one area where GC Does
Suck (TM).  But there are more solutions than the basic generational-
GC technique that Linus mentions.  First, Escape Analysis-based
optimizations, like stack allocation, allow many objects to be
deallocated as soon as they become unused - just like Linus wants.
Second, a base typesystem that includes lightweight objects, like C#
today and very likely Java in JDK 8+, can also help a lot, for example
by packing some tightly-coupled objects together so they always
exhibit excellent memory locality; or by allowing by-value arrays with
the same benefit, etc. Finally, for massive applications whose heaps
vastly exceed even the L2/L3 caches - your typical 64-bit JavaEE
server process with 8 GB of heap and hundreds of live threads - the
problem of locality vanishes, because even a competing C application
wouldn't manage a very high cache hit rate, simply because it's always
thrashing the CPUs between tons of concurrent threads and trashing the
caches by pumping gigabytes of data off a DBMS and into webpages and
other outputs, every second.
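
To make the escape-analysis point concrete, here is a minimal sketch. The names Vec2 and EscapeDemo are made up for illustration, and whether a given JVM actually removes the allocations depends on its JIT and on inlining decisions, so treat it as a picture of the idea rather than a guarantee. (HotSpot, for instance, does this as "scalar replacement" rather than literal stack allocation.)

```java
// Hypothetical illustration: Vec2 and EscapeDemo are invented names.
final class Vec2 {
    final double x, y;

    Vec2(double x, double y) {
        this.x = x;
        this.y = y;
    }

    Vec2 minus(Vec2 o) {
        return new Vec2(x - o.x, y - o.y);
    }
}

final class EscapeDemo {
    // None of the three Vec2 temporaries escapes this method: they are not
    // stored in a field, not returned, and not handed to unknown code.
    // A JIT with escape analysis MAY therefore keep their fields in
    // registers and skip the heap allocation entirely (assumption: the
    // calls get inlined; nothing in the language guarantees it).
    static double distanceSquared(double ax, double ay, double bx, double by) {
        Vec2 a = new Vec2(ax, ay);
        Vec2 b = new Vec2(bx, by);
        Vec2 d = a.minus(b);
        return d.x * d.x + d.y * d.y;
    }
}
```

If the optimization kicks in, the temporaries cost nothing for the GC and never touch a cold cache line, which is exactly the "freed as soon as unused" behavior.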

The ugly part is Linus's talk about the GCs that use refcounting
internally (as in, an actual 'refcount' field stored in each object
and updated at each change of incoming pointers), and even expose it
to the program... is that serious? It suffices to say that GC-by-
refcount is a dead idea, mostly because the update of the refcount
must be thread-safe, so this adds the overhead of at least an atomic
CAS to every single freaking reference update, which is MASSIVE
overhead by today's standards.
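
For reference, this is roughly what a thread-safe refcount costs at every pointer update. It is only a sketch: RefCounted is a made-up class, not any real collector's implementation, and a real runtime would do this in generated code rather than in a Java object.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a per-object, thread-safe reference count.
final class RefCounted {
    // Starts at 1: the creator holds the first reference.
    private final AtomicInteger refs = new AtomicInteger(1);

    // Called every time a new pointer to this object is stored.
    void retain() {
        int c;
        do {
            c = refs.get();
            if (c == 0) {
                throw new IllegalStateException("object already released");
            }
        } while (!refs.compareAndSet(c, c + 1)); // atomic CAS, retried on contention
    }

    // Called every time a pointer to this object is overwritten or dropped.
    // Returns true when this call released the last reference.
    boolean release() {
        int c;
        do {
            c = refs.get();
            if (c == 0) {
                throw new IllegalStateException("released too many times");
            }
        } while (!refs.compareAndSet(c, c - 1)); // another atomic CAS
        return c == 1;
    }

    int count() {
        return refs.get();
    }
}
```

Every plain "a.field = b" would turn into one retain() on the new target plus one release() on the old one, i.e. two atomic read-modify-write operations on memory that other cores may be fighting over.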

Another subtle demonstration of Linus' modest familiarity with GC is
that he does NOT point (even indirectly) at some important problems of
GC that he, as a kernel developer, would certainly use to wipe the
floor with GC's soul:

1) The collector often needs to visit cold memory pages, even ones
that may be swapped to disk. It's indeed horrible to bring a page from
RAM all the way up to L1 cache, or even worse from disk to RAM, just to
perform a marking cycle that may not even find anything to deallocate
in that page.  This is one area that receives intense research; there
are already techniques that make the problem much smaller, but it's
not yet in the solved problems category.
2) Higher total memory usage due to semispaces, fragmented slack in
partitioned heaps, and other reasons. (Sometimes not a big problem,
considering that many programs with manual memory reclamation also
waste memory with high heap fragmentation, or due to very complex
object graphs that are risky to deallocate incrementally so app code
tends to retain many objects longer than necessary, until the
destructor of some "god object" is invoked. Not to mention leaks, much
more common without GC...)

The most interesting part is Linus's claims about mindset. I agree on
the potential problem; but I disagree that it's as severe in the real
world as Linus implies. Looking at his example of copy-on-write/sharing,
these techniques are _completely_ independent from memory reclamation
policy. The only issue here is aliasing; if you know that some field
can never "escape to the outer world" (by a public method's return or
other means), then you can precisely control its lifecycle, and you
can easily avoid redundant copies, either with explicit refcounting
and COW, or sometimes not even needing that (when you share the object
to build a new, short-lived structure that you know won't change
the shared object).
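
A minimal sketch of hand-rolled copy-on-write in a GC language, to show that nothing about it depends on how memory is reclaimed. CowIntArray is a made-up class; it is single-threaded only, and it uses a conservative "shared" flag instead of a real refcount, so the second writer may make one copy that a refcount would have avoided.

```java
import java.util.Arrays;

// Hypothetical illustration: copy-on-write wrapper around an int[].
final class CowIntArray {
    private int[] data;     // never handed out, so it cannot escape
    private boolean shared; // true while another wrapper may alias 'data'

    private CowIntArray(int[] data, boolean shared) {
        this.data = data;
        this.shared = shared;
    }

    static CowIntArray of(int... values) {
        // Defensive copy: the caller keeps its own array.
        return new CowIntArray(values.clone(), false);
    }

    // O(1) "copy": both wrappers point at the same backing array.
    CowIntArray share() {
        shared = true;
        return new CowIntArray(data, true);
    }

    int get(int i) {
        return data[i];
    }

    void set(int i, int v) {
        if (shared) {
            // First write after sharing: pay for the real copy now.
            data = Arrays.copyOf(data, data.length);
            shared = false;
        }
        data[i] = v;
    }
}
```

The old backing array is simply dropped when nobody points at it anymore; in C you would write the same class plus the bookkeeping needed to free it.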

GC is even better here, first because these optimizations are even
easier to do in a GC language, because the programmer knows that
nothing really bad (heap corruption, process crash, or leak) may
happen as a result of a subtle bug, such as a double-free. The worst
kind of bug you can get with these tricks is "logical corruption" (not
copying an object that should have been copied, so one data structure
changes data from another).
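
Here is what that "logical corruption" looks like, as a deliberately buggy sketch (Histogram is a made-up class). The missing defensive copy makes two objects share one array; the program computes wrong numbers, but the heap stays intact and nothing crashes.

```java
// Hypothetical illustration of an aliasing bug in a GC language.
final class Histogram {
    private final int[] bins;

    Histogram(int[] bins) {
        // BUG (on purpose): should be bins.clone(); the caller's array is
        // aliased, so every Histogram built from it shares the same state.
        this.bins = bins;
    }

    void add(int bin) {
        bins[bin]++;
    }

    int get(int bin) {
        return bins[bin];
    }
}
```

Build two Histograms from the same int[] and an add() on one shows up in get() on the other. Annoying to debug, but still a far cry from a double-free.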

More: in theory, a compiler or runtime is able to do alias analysis
and find opportunities to automatically suppress data copies or even
inject refcounting.  This optimization could even be dynamic,
benefiting objects that could be shared because the code allows it
(e.g. public fields), but in practice are never shared in a particular
program execution. In practice these optimizations are tricky because
they have other complications (e.g. not breaking programmer's
assumptions about object identity and sharing of monitors), and I
think the only reason this is not an active area of research is that
the cost/benefit is very small in most cases - and for the exceptions
where it pays off, such as sharing large objects (think stream
buffers) or expensive-to-build objects, we're usually better off with
handcoded optimization anyway.

So this issue boils down to: are Java (and other GC-happy-languages')
programmers less willing to use these optimizations, because their
memory management skills are dumbed down by the convenience of GC?  I
think this may be true for the typical run-of-the-mill Java app
programmer, but also very likely true for the typical Joe-sixpack C/C++
app programmer (are there any today? well there certainly were,
before Java became popular).  Linus may have a very high regard for
the skills of C programmers, but he's biased by his overwhelming
collaboration with kernel and compiler hackers - not a "typical
programmers" community in any language.

A+
Osvaldo
