Re: Confusion regarding 'mark-sweep-compact' naming

Gil Tene Mon, 14 Aug 2017 08:30:48 -0700


On Sunday, August 13, 2017 at 11:45:09 PM UTC-7, Kirk Pepperdine wrote:
>
> Hi, 
>
>
> > On Aug 14, 2017, at 7:47 AM, Peter Veentjer <[email protected] 
> <javascript:>> wrote: 
> > 
> > I have been improving my gc knowledge and there is some confusion on my 
> side regarding the 'mark-sweep-compact' algorithm I frequently see 
> mentioned in posts, articles and some not too formal books on the topic. 
> E.g. 
> > 
> > AFAIK mark-sweep-compact is not correct. It is either mark-sweep or 
> mark-compact where as part of the compacting process the dead objects are 
> removed. When I check more formal literature like "The Garbage Collection 
> Handbook: the Art of Automatic Memory Management", then there is no 
> mentioning of mark-sweep-compact. 
> > 
> > So I guess that when 'mark-sweep-compact' is mentioned, they are 
> actually referring to 'mark-compact'. Is this assumption correct? 
>
> The terminology around this is very loose so the very short answer is…. 
> sort of. So, a very terse but hopefully useful summary of collectors. 
>
> When we talk about mark-sweep we are generally referring to a class of 
> garbage collectors known as tracing collectors. The actual details of how 
> the mark and sweep for each of the different tracing collectors is 
> implemented depend upon a couple of factors. One of the most important 
> factors is how memory pool(s) (data structure in heap space) is organized. 
> I tend to label a collector as either “in-place” or “evacuation”. The most 
> trivial examples for both are; if you only have a single memory pool then 
> you will be performing an “in-place” collection as the data remains in the 
> that memory pool at the end of the collection. If you have two spaces you 
> can set this up as what is known has hemispheric collections. In this case 
> you mark the data in the currently active memory pool and then “evacuate” 
> it to the inactive pool. When complete active becomes inactive (thus empty) 
> and inactive becomes active. Wash, rinse, repeat…. 
>
> In-place collections will need a free-list and need to some how deal with 
> fragmentation (of the free-list). Thus they compact. When and how often is 
> an implementation detail. Evacuating collectors use a top of heap pointer 
> (and other more complex tricks) that act as a free list. Since they 
> evacuate to another pool, the data will naturally be compacted. Thus no 
> compaction will be necessary. As you can imagine, the time/space cost 
> models for these two general approaches are quite different. 
>
> In OpenJDK we have both generational and regional collectors. The 
> generational collector divides memory into 4 memory pools named Eden, 
> Survivor 0, Survivor 1, and Tenured space. The first three are grouped into 
> young and Tenured into old space. In young we can use mark-evacuation where 
> as in old we are forced to use mark-sweep-compact. I’m going to claim (I’m 
> sure Gil will want to chime in here) that G1, C4, Balance and Shenandoah 
> are regional collectors in that they divide heap into a large number of 
> memory pools (regions).



Agreed. These are all regional Mark-Compact evacuating collectors. But G1, 
Balanced and Shenandoah are olden-only regional collectors (or single 
generation in the current Shenandoah case), while C4 uses both a regional 
Mark-compact newgen collector and a regional Mark-Compact oldgen collector. 
As such, it gets all the nice-for-your-electron-budget benefits of 
generational collection along with all the nice benefits of regional 
collection.
 

> G1 aims for 2048 memory pools. In all 4 cases, the collectors are very 
> very different variants of mark-evacuation each with their own quirks. The 
> advantage of using regions is that you only need an evacuating collector as 
> you don’t end up with a terminal pool (tenured) where you are forced to use 
> an in-place collector. If you are careful in how you select regions to be 
> recovered, you can realize some nice wins over generational collectors. 
> However, these collectors tend to be far more complex and harder on your 
> electron budget than generational collectors are.


The harder-on-electron-budget thing is not algorithmically inherent to 
evacuating regional collectors. It usually has to do with other algorithmic 
choices that may be made in collectors that happen to be regional. E.g. 
some region-based incremental STW compactors (G1, Balanced) generally try 
to compact a small increment of the heap and fixup a hopefully-limited set 
of regions that refer to the compacted set, all in a single STW pause. 
These algorithms tend to rely on cross-region remembered-set tracking which 
involves some interesting electron budget implications, but more 
importantly, because the same regions will be involved in multiple fixup 
passes, the resulting N^2 complexity potential can also cause the electron 
budget to significantly inflate. Evacuating regional collector algorithms 
that do not rely on cross-region remembered sets, and do not attempt to 
perform compaction is short incremental STW pauses (e.g. Pauseless, 
Compressor, C4, Shenandoah, ...) do not incur these specific extra costs.


 

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Re: Confusion regarding 'mark-sweep-compact' naming

Reply via email to