Re: Confusion regarding 'mark-sweep-compact' naming

Gil Tene Fri, 16 Aug 2019 10:09:10 -0700


Sent from my iPad

On Aug 16, 2019, at 2:53 AM, Steve Blackburn 
<[email protected]<mailto:[email protected]>> wrote:

For what it is worth, Kathryn and I have previously clarified some of this 
nomenclature (see our Immix 
paper<http://users.cecs.anu.edu.au/~steveb/pubs/papers/immix-pldi-2008.pdf>).

Marking: identifying liveness via a trace
Sweeping: identifying free space while leaving live objects in place (classic 
mark-sweep and immix do this)
Evacuating: freeing space by moving survivors to another place
Compacting:  freeing space moving survivors within the same place.

:-)

Now we have the “is it Compacting or Evacuating?” question...

Because of Quick Release (C4 paper [1], section 2.5, and originally described 
in 2005 in the Pauseless paper [2], section 4.2) C4 [1] , Pauseless [2], Collie 
[3], Compressor [4], and ZGC [reference needed] are all examples of collectors 
created in the past ~15 years that can be classified as either/both Compacting 
or Evacuating under the nomenclature above.

At first glance, one might be tempted to say that they are simply Evacuating 
collectors: from the point of view of each region, they clearly appear to free 
space by moving survivors to another place (to outside of the region), rather 
then moving them within the same place (the same region).

But from an actual memory space perspective (the actual memory storage used by 
any set of regions being compacted) they are clearly Compacting, since with the 
hand-over-hand compaction enabled by Quick Release, surviving objects are 
clearly moved within the same “place”: the set of actual storage being 
compacted in a given relocate phase, where the survivors are before the moving 
started. These collectors exhibit a tell-tale quality of “in place” compaction: 
they move survivors directly on top where other survivors were just before they 
themselves were moved, all within the same pass.

In virtual memory, these collectors have a clear separation between the two 
places that are the “from space” and the “to space” for a given relocation 
pass. Each is a distinct set of virtual memory ranges, and they do not overlap.

But in actual storage, that is not the case in any of these collectors, as the 
“from space” and “to space” for a given relocation pass overlap with each other 
in physical space.

I obviously consider C4 to be a Compacting collector (it’s right there in the 
name: “The Continuously Concurrent Compacting Collector”), and would therefore 
classify Pauseless, Collie, Compressor, and ZGC as Compacting as well.

But I would also consider them all to be Evacuating regional collectors. 
Because that behavior (in logical/virtual memory, which is where our thoughts 
focus most of the time) clearly fits too. Region evacuation is core to 
Pauseless and C4, and the C4 name was actually a nod at how it frees memory by 
“blowing up” individual regions and moving all live objects “elsewhere”.

However, I wouldn’t call C4/Pauseless/Collie/Compressor/ZGC “in place 
Compactors”, because even tho at the physical storage level that is what 
happens, my brain hurts when I try to keep that picture in my head too long.

We too attempted to further clarify the nomenculture in the Collie paper [3] by 
differentiating different forms of Compaction in terms of whether or not the 
virtual address ranges of the from space and to space overlap, with “in-place 
compactors” being an example of a subset of compactors that exhibit such 
overlap:

“... Compaction of objects involves copying the reachable objects from a 
from-space into mostly contiguous memory locations in a to-space, replacing all 
pointers to from-space locations with corresponding pointers to to-space 
locations and reclaiming from-space memory for re-use. While the “from” and the 
“to” spaces could technically overlap in some collectors (such as the case in 
in-place compactors), this paper is mostly concerned with compactors that use 
non-overlapping from-space and to-space address ranges.”

This description is useful in the Collie paper, because it then delves into the 
ways that  consistency of object state is maintained during compaction, in any 
concurrent compacting collector, and introduces new terms useful in analysis: 
e.g. referrer set, transplantation, the constraints needed for general 
concurrent transplantation, and the key distinction between individually 
transplantable and non-individually transplantable objects that the paper then 
builds in to describe a wait free compactor.

[1] “C4: The Continuously Concurrent Compacting Collector” 
https://www.azul.com/files/c4_paper_acm1.pdf
[2] “The Pauseless GC Algorithm” 
https://www.usenix.net/legacy/events/vee05/full_papers/p46-click.pdf
[3] “The Collie: A Wait-Free Compacting Collector” 
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.406.7321&rep=rep1&type=pdf
[4] “The Compressor: Concurrent, Incremental, and Parallel Compaction” 
https://www.cs.utexas.edu/~speedway/fp031-kermany.pdf

We use the term 'defragmentation' to refer to the use of evacuation to free up 
space in a regional collector.

--Steve

On Friday, 16 August 2019 18:30:10 UTC+10, Aleksey Shipilev wrote:
On 8/16/19 10:07 AM, Gil Tene wrote:
> Classification terms evolve as the art evolves.

What I want to emphasize here that this discussion reinvents the meaning of 
"sweep". That definition
is not used in the way you describe in the sources I know of. Granted, 
definitions drift over time,
but we have to be careful to separate what is the "agreed on" definition, and 
what is whatever
definition we want to be the emergent one.

> On Thursday, August 15, 2019 at 1:42:23 AM UTC-7, Aleksey Shipilev wrote:
>     I am trying to understand what is your central argument here. This seems 
> to be it. Are you saying
>     that "sweeping" is when you visit dead objects, and non-"sweeping" is 
> when you do not?
>
> Yes. I think that's a very good summary.

Would be helpful if we started from this next time around!

>     Is walking _dead_ objects the discriminator for "sweeping" then? So in 
> your book, if we take the
>     same Lisp2 collector, and compute-new-addresses and adjust-pointers steps 
> are walking the
>     self-parsable heap (visiting dead objects along the way), it is M-S-C? 
> But if it uses marking
>     bitmap
>     (thus only visiting live objects), it becomes M-C? [This would be a weird 
> flex, but okay].
>
>
> Exactly. "In my book" adding an efficient livemark bit vector with possible 
> summary layers would
> covert the classic LISP2 GC from a Mark-Sweep-Compact to a Mark-Compact (with 
> no sweep).

So this is what irks me here. In my Epsilon's mark-compact toy collector
(https://shipilev.net/jvm/diy-gc/), changing from mark bitmaps to self-parsable 
heap traversal is
about 5 lines of code. This does not feel like passing the bar for 
reclassification of the collector
algo between major classes. It feels more like the implementation detail that 
can swing both ways,
depending on circumstances.

-Aleksey

--
You received this message because you are subscribed to a topic in the Google 
Groups "mechanical-sympathy" group.
To unsubscribe from this topic, visit 
https://groups.google.com/d/topic/mechanical-sympathy/Kk0rQij7q8I/unsubscribe.
To unsubscribe from this group and all its topics, send an email to 
[email protected]<mailto:[email protected]>.
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/869e1ac0-7edd-43ca-afe1-973b0410ff76%40googlegroups.com<https://groups.google.com/d/msgid/mechanical-sympathy/869e1ac0-7edd-43ca-afe1-973b0410ff76%40googlegroups.com?utm_medium=email&utm_source=footer>.

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/F29FB55B-5DE3-413D-A648-1897A28F3EA2%40azul.com.

Re: Confusion regarding 'mark-sweep-compact' naming

Reply via email to