Re: Confusion regarding 'mark-sweep-compact' naming

Steve Blackburn Mon, 19 Aug 2019 04:41:57 -0700

What I said above was not as sharp as it should have been.

I see a crisp distinction between 'evacuation' and 'compaction'.


- Evacuation moves (evacuates) survivors into a pre-existing reserve 
(however small) of heap space.
- Compaction requires no such reserve; survivors are only moved into space 
freed in the process of performing the compaction.

Evacuation may be done at a very fine grain, requiring absolutely minimal 
reserves, or in the other extreme, it may require reserving half the heap 
(semi-space).   Still, I would say 'evacuation' so long as some free 
reserve (however small) is required in order for the algorithm to proceed. 
  Thus while "in-place compaction" is a good description, I'd say it is a 
tautology.

I know this crisp distinction has not been universally observed in the 
literature, but I think it is helpful because it is absolutely crisp, and 
cuts to the very heart of the underlying algorithm.

We use the term 'defragmentation' to describe a fine-grained evacuation 
strategy (for example, one that moves single objects into small holes in 
the heap).   Still, we'd call this an evacuating collector, not a 
compacting collector.   We thought long and hard about this at the time we 
did our work, and settled on the above for the above reasons.

So I'd ask the question: does this algorithm require a reserve of heap 
space (even a small one) before it can proceed.   If so, it's evacuation.

One more thing: obviously we can and do routinely build collectors that are 
hybrids, so there's absolutely no reason why a collector could not do both.

--Steve

On Saturday, 17 August 2019 03:08:47 UTC+10, Gil Tene wrote:
>
>
>
> Sent from my iPad
>
> On Aug 16, 2019, at 2:53 AM, Steve Blackburn <[email protected] 
> <javascript:>> wrote:
>
> For what it is worth, Kathryn and I have previously clarified some of this 
> nomenclature (see our Immix paper 
> <http://users.cecs.anu.edu.au/~steveb/pubs/papers/immix-pldi-2008.pdf>). 
>
> Marking: identifying liveness via a trace
> Sweeping: identifying free space while leaving live objects in place 
> (classic mark-sweep and immix do this)
> Evacuating: freeing space by moving survivors to another place
> Compacting:  freeing space moving survivors within the same place.
>
>
> :-)
>
> Now we have the “is it Compacting or Evacuating?” question...
>
> Because of Quick Release (C4 paper [1], section 2.5, and originally 
> described in 2005 in the Pauseless paper [2], section 4.2) C4 [1] , 
> Pauseless [2], Collie [3], Compressor [4], and ZGC [reference needed] are 
> all examples of collectors created in the past ~15 years that can be 
> classified as either/both Compacting or Evacuating under the nomenclature 
> above.
>
> At first glance, one might be tempted to say that they are simply 
> Evacuating collectors: from the point of view of each region, they clearly 
> appear to free space by moving survivors to another place (to outside of 
> the region), rather then moving them within the same place (the same 
> region).
>
> But from an actual memory space perspective (the actual memory storage 
> used by any set of regions being compacted) they are clearly Compacting, 
> since with the hand-over-hand compaction enabled by Quick Release, 
> surviving objects are clearly moved within the same “place”: the set of 
> actual storage being compacted in a given relocate phase, where the 
> survivors are before the moving started. These collectors exhibit a 
> tell-tale quality of “in place” compaction: they move survivors directly on 
> top where other survivors were just before they themselves were moved, all 
> within the same pass.
>
> In virtual memory, these collectors have a clear separation between the 
> two places that are the “from space” and the “to space” for a given 
> relocation pass. Each is a distinct set of virtual memory ranges, and they 
> do not overlap.
>
> But in actual storage, that is not the case in any of these collectors, as 
> the “from space” and “to space” for a given relocation pass overlap with 
> each other in physical space.
>
> I obviously consider C4 to be a Compacting collector (it’s right there in 
> the name: “The Continuously Concurrent Compacting Collector”), and would 
> therefore classify Pauseless, Collie, Compressor, and ZGC as Compacting as 
> well.
>
> But I would also consider them all to be Evacuating regional collectors. 
> Because that behavior (in logical/virtual memory, which is where our 
> thoughts focus most of the time) clearly fits too. Region evacuation is 
> core to Pauseless and C4, and the C4 name was actually a nod at how it 
> frees memory by “blowing up” individual regions and moving all live objects 
> “elsewhere”.
>
> However, I wouldn’t call C4/Pauseless/Collie/Compressor/ZGC “in place 
> Compactors”, because even tho at the physical storage level that is what 
> happens, my brain hurts when I try to keep that picture in my head too long.
>
> We too attempted to further clarify the nomenculture in the Collie paper 
> [3] by differentiating different forms of Compaction in terms of whether or 
> not the virtual address ranges of the from space and to space overlap, with 
> “in-place compactors” being an example of a subset of compactors that 
> exhibit such overlap:
>
> “... Compaction of objects involves copying the reachable objects from a 
> from-space into mostly contiguous memory locations in a to-space, replacing 
> all pointers to from-space locations with corresponding pointers to 
> to-space locations and reclaiming from-space memory for re-use. While the 
> “from” and the “to” spaces could technically overlap in some collectors 
> (such as the case in in-place compactors), this paper is mostly concerned 
> with compactors that use non-overlapping from-space and to-space address 
> ranges.”
>
> This description is useful in the Collie paper, because it then delves 
> into the ways that  consistency of object state is maintained during 
> compaction, in any concurrent compacting collector, and introduces new 
> terms useful in analysis: e.g. referrer set, transplantation, the 
> constraints needed for general concurrent transplantation, and the key 
> distinction between individually transplantable and non-individually 
> transplantable objects that the paper then builds in to describe a wait 
> free compactor.
>
> [1] “C4: The Continuously Concurrent Compacting Collector” 
> https://www.azul.com/files/c4_paper_acm1.pdf
> [2] “The Pauseless GC Algorithm” 
> https://www.usenix.net/legacy/events/vee05/full_papers/p46-click.pdf
> [3] “The Collie: A Wait-Free Compacting Collector” 
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.406.7321&rep=rep1&type=pdf
> [4] “The Compressor: Concurrent, Incremental, and Parallel Compaction” 
> https://www.cs.utexas.edu/~speedway/fp031-kermany.pdf
>
>
> We use the term 'defragmentation' to refer to the use of evacuation to 
> free up space in a regional collector.
>
> --Steve
>
> On Friday, 16 August 2019 18:30:10 UTC+10, Aleksey Shipilev wrote: 
>>
>> On 8/16/19 10:07 AM, Gil Tene wrote: 
>> > Classification terms evolve as the art evolves. 
>>
>> What I want to emphasize here that this discussion reinvents the meaning 
>> of "sweep". That definition 
>> is not used in the way you describe in the sources I know of. Granted, 
>> definitions drift over time, 
>> but we have to be careful to separate what is the "agreed on" definition, 
>> and what is whatever 
>> definition we want to be the emergent one. 
>>
>> > On Thursday, August 15, 2019 at 1:42:23 AM UTC-7, Aleksey Shipilev 
>> wrote: 
>> >     I am trying to understand what is your central argument here. This 
>> seems to be it. Are you saying 
>> >     that "sweeping" is when you visit dead objects, and non-"sweeping" 
>> is when you do not? 
>> > 
>> > Yes. I think that's a very good summary. 
>>
>> Would be helpful if we started from this next time around! 
>>
>> >     Is walking _dead_ objects the discriminator for "sweeping" then? So 
>> in your book, if we take the 
>> >     same Lisp2 collector, and compute-new-addresses and adjust-pointers 
>> steps are walking the 
>> >     self-parsable heap (visiting dead objects along the way), it is 
>> M-S-C? But if it uses marking 
>> >     bitmap 
>> >     (thus only visiting live objects), it becomes M-C? [This would be a 
>> weird flex, but okay]. 
>> > 
>> > 
>> > Exactly. "In my book" adding an efficient livemark bit vector with 
>> possible summary layers would 
>> > covert the classic LISP2 GC from a Mark-Sweep-Compact to a Mark-Compact 
>> (with no sweep). 
>>
>> So this is what irks me here. In my Epsilon's mark-compact toy collector 
>> (https://shipilev.net/jvm/diy-gc/), changing from mark bitmaps to 
>> self-parsable heap traversal is 
>> about 5 lines of code. This does not feel like passing the bar for 
>> reclassification of the collector 
>> algo between major classes. It feels more like the implementation detail 
>> that can swing both ways, 
>> depending on circumstances. 
>>
>> -Aleksey 
>>
>> -- 
> You received this message because you are subscribed to a topic in the 
> Google Groups "mechanical-sympathy" group.
> To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/mechanical-sympathy/Kk0rQij7q8I/unsubscribe
> .
> To unsubscribe from this group and all its topics, send an email to 
> [email protected] <javascript:>.
> To view this discussion on the web, visit 
> https://groups.google.com/d/msgid/mechanical-sympathy/869e1ac0-7edd-43ca-afe1-973b0410ff76%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/mechanical-sympathy/869e1ac0-7edd-43ca-afe1-973b0410ff76%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/528f4ea4-080f-46d0-910c-cd68381ccc8c%40googlegroups.com.

Re: Confusion regarding 'mark-sweep-compact' naming

Reply via email to