> On Aug 14, 2019, at 10:40 AM, Aleksey Shipilev <[email protected]> 
> wrote:
> 
> On 8/14/19 11:30 AM, Gil Tene wrote:
>> For an easy to follow code example of Mark, Sweep, and Compact, look at the 
>> (ahem...) psMarkSweep.c
>> code for the compacting OpenJDK ParallelGC oldgen implementation, which does 
>> 4 separate passes (
>> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/7576bbd5a03c/src/share/vm/gc_implementation/parallelScavenge/psMarkSweep.cpp#l201
>>  ):
>> 
>> 1. A  tracing, pointer chasing pass that marks the live set.
>>     Line 514: // Recursively traverse all live objects and mark them
>> 
>> 2. A sweeping pass that computes the new object addresses
>>     Line 579: // Now all live objects are marked, compute the new object 
>> addresses.
>>     (new addresses are computed to ”shift everything to one side”,
>>      which, when actually done in pass 4, will compact the heap in place)
>> 
>> 3. A sweeping pass to adjust references to point to the new object addresses:
>>     Line 598: // Adjust the pointers to reflect the new locations
>>     (this is sometimes referred to as a fixup pass)
>> 
>> 4. A compact pass that moves the objects to their previously chosen target 
>> locations.
>>     Line 645: // All pointers are now adjusted, move objects accordingly
>> 
>> This is a classic Mark, Sweep, and Compact collector. Pass 1 only visits  
>> live objects, while passes
>> 2-4 linearly traverse the entire heap, live or dead), going forward one 
>> object at a time, and
>> skipping over the dead objects one by one.
> 
> I think the descriptions of steps (2) and (3) seriously stretch the 
> definition of "sweep". If we
> creatively redefine it to mean any (linear) heap walk, then most (all, except 
> for copying?)
> collectors should be marked as "sweep", which defies the purpose of the 
> definition.
> 
> This describes Lisp2-style collector, which is listed as (*drumroll*) 
> "Mark-Compact":
>  https://en.wikipedia.org/wiki/Mark-compact_algorithm#LISP2_algorithm
> 
> GC handbook mentions only "Mark-Compact" and "Mark-Sweep" styles, no 
> "Mark-Sweep-Compact".

Yup. those parts were written before Quick Release was commonly used to makes 
pure
Mark/Compact (no sweep) practical and profitable (over the other two variants).
I think that the G1GC evacuation stuff was also "too young" at the time of the 
writing to
consider the implications on terminology there, as G1GC-style collectors, and 
(I think Shenandoah
as well, since they share the same SATB marker logic), tend to keep separate 
live marks in
bits outside of object header for the same avoid-the-sweep reasons, and don't 
visit dead
objects when evacuating regions.

> The MM
> Glossary [https://www.memorymanagement.org/glossary/s.html#sweeping] says: 
> "Sweeping is the second
> phase (“the sweep phase”) of the mark-sweep algorithm. It performs a 
> sequential (address-order) pass
> over memory ***to recycle unmarked blocks.***" (emphasis mine)

Pass 2 above does exactly the "sequential (address-order) pass over memory to 
recycle unmarked blocks"
thing, and that makes it a classic sweep. It does a classic "find, visit, 
parse, and track all the dead stuff"
pass, and will use that information to recycle all of them (by compacting those 
holes). The way it tracks
the "free stuff" is just one of many possible ways to track the recycle-able 
holes identified in a sweep:
It 2 tracks them by reflecting the information about where the the dead holes 
are in the newly computed
address for the next live object it finds (the "how much to shift the next live 
object's address" amount
grows by the size of each dead object encountered and parsed to determine that 
size). If you wanted to,
you could directly deduce or reconstruct the list of dead hole addresses and 
sizes by walking the newly
computed object addresses backwards. It's just that the way this "list" is 
stored is optimized for the next
passes in this collector.

The way different sweepers track the recycle-able holes they find varies 
dramatically (lists, trees, bucketed
lists, buddy lists, some magic encoding in other data) but the work done to 
find them is fundamentally the
same in all sweepers: they linearly parse the heap, one object at a time, 
figuring out where the next object
starts by looking the at material within the object (dead or alive) being 
visited, and store (or embed) the
information they find about where the holes are into some data structure.

The fact that they visit and parse dead objects in order to identify 
recycle-able memory is what makes them
sweepers. Other techniques that don't visit dead objects do not perform a 
sweep. Those usually take the
apparoiach of "take all the live objects somewhere else, and the whole thing 
becomes one big free blob"
over the "sweep through the individual dead things and rack them for some 
recycling operation that fills
them in individually". Examples of those clearly-non-sweepers include copying 
collectors that evacuate and
fixup an entire from space in one tracing pass that only ever visits live 
objects. Regional evacuating
collectors that evacuate one region at a time (as opposed to an entire from 
space in an all-or-nothing
approach) but do so without ever visiting or parsing a dead object also fall 
into this "does not do a sweep"
category.

> 
> I would leave "sweep" as something that reclaims the storage of the objects 
> (i.e. "sweeps away" the
> garbage), which makes M-C and M-S classes more clearly defined and easier to 
> reason about.

When M-C implies "also does a full linear traversal that visits all dead 
objects" (aka "a sweep"), that
terminology fails. Examples like C4, G1GC, Shenandoah, and ZGC are all 
Mark-Compact (NO Sweep)
collectors, while ParallelGC and Serial GC are Mark-Compact (WITH Sweep) 
collectors. Since the math
behind those varies dramatically in wasy that effect everyday choices, using 
"M-C" as a classification
that covers both leads people to misunderstand the interactions between e.g. 
heap size and GC work.

For example, I know that for C4 and ZGC (and I believe that for Shenandoah as 
well), all of which
are variants of Mark/Compact (the non-sweeping kind), increasing the heap 
without increasing the
application live set adds virtually no cost to the cost the collector bears per 
GC cycle, and as a
result, it improves the collector's efficiency linearly (i.e. doubling the heap 
size improves the efficiency
by 2x) without elongating any GC work time (pausing or not). This quality makes 
"use more heap if
you ahve the memory for it" an always-good thing to do (it never hurts).

But in contrast, in ParallelGC and SerialGC, whose oldgen collectors are also 
variants of Mark/Compact
(the also sweeping kind). increasing the oldgen heap size significantly 
increase the work the collector
does in each cycle, since the sweeping passes will fundamentally take longer to 
perform. This creates
a tradeoff between efficiency and other things. A larger heap for a given live 
set does mean less frequent
collection, and overall less work per allocation unit, since even as each 
collection gets individually longer,
some parts of each cycle (the tracing and object moving parts) stay the same 
length, so the of increase
in cycle length is smaller than the decrease in frequency. But if cycle time 
matters for something other
than efficiency (like pause time in  a STW collector), or if longer cycle times 
end up costing elsewhere
(heuristics causing more frequent triggering in a concurrent collector to leave 
more room for the cycle to
complete), those efficiency benefits may be offset, and "use a larger heap" is 
an often
"Not the right thing to do", and can seriously hurt.

The difference between those two comes down to the difference between 
Mark-Compact (no Sweep)
and Mark-Compact (with Sweep).

> 
> Introducing the fixup pass is not very relevant here, because it talks about 
> how the heap references
> are managed, not how the object storage is treated. Regardless of what modern 
> concurrent collector
> does (separate update-refs phase, piggybacking on next marking, self-fixing 
> barriers) to update the
> references, the objects are already either copied/compacted away, or the 
> garbage is swept away in
> (per-object-sized) blocks.

The fixup ia simply an integral part of compacting, as compacting is not 
complete until the
fixup is complete. I agree that it is not part of the Sweep part. But whether 
or not a separate
visit-every-live-object fixup pass happens or not in a compacting collector has 
as direct
a relationship on it's overall cost as whether or not a separate 
visit-all-the-dead-objects
sweep pass is done.

This difference is not as critical in very parse heaps (like newgen in most 
cases, or in Oldgens
that are sized to only have a small fraction of the heap size live), because 
the cost considerations
around doing or of not doing a visit-all-dead-objects sweep greatly out-weigh 
any costs
considerations around doing/not-doing a separate visit-all-the-live-objects 
pass for fixup in
those cases.

But it becomes fundamental at larger heap occupancies, where the live set may 
be a significant
portion of the overall heap size. This does not show up in silly things like 
SpecJBB (which
tends to have a alive set of ~1GB but is usually run with heaps in the tens of 
GB or higher)
but is commonplace in large-in-heap state situations that are actually trying 
to make use of
memory (like any form on in-memory cache or data, and e.g. in Spark, 
Elastic/Solr, Cassandra,
Hadoop-name-nodes, or 
name-your-app-that-actually-keeps-lostof-stuff-in-the-heap)

When heap utilizations are higher (e.g. if you actually have 40GB or more of 
live data in that
60GB heap), the difference in cost between doing two passes (one tracing 
MarkFixup, one
linear relocate), or three passes (one tracing Mark, one linear relocate, and 
an additional (either
linear or tracing) fixup is huge. In the end, the relevant live data (the 
object headers and bodies)
will either be brought into your CPI caches twice or 3 times. That's a 50% 
difference. And it is
as big as the difference of doing or not doing a sweep.

But we probably don't have room for the "with separate fixup pass" vs. "without 
separarte fixup pass"
distinction in the already mixed up Mark-Compact subcategories.


> 
> -Aleksey
> 

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/2BC14275-7C5B-4783-881B-9DA433A249DC%40azul.com.

Attachment: signature.asc
Description: Message signed with OpenPGP

Reply via email to