Hi Gil, thanks for your detailed answer.
So mark-sweep-compact implementations really do exist. I'll need to update a set of flashcards.

On Wednesday, August 14, 2019 at 12:30:10 PM UTC+3, Gil Tene wrote:
>
> > On Aug 13, 2019, at 11:34 PM, Peter Veentjer <pe...@hazelcast.com> wrote:
> >
> > Thanks for your answer Aleksey,
> >
> > comments inline.
> >
> > On Tuesday, August 6, 2019 at 12:41:28 PM UTC+3, Aleksey Shipilev wrote:
> >>
> >> On 8/6/19 7:38 AM, Peter Veentjer wrote:
> >> > There is still some confusion on my side. This time it is regarding
> >> > the algorithmic complexity of mark-sweep-compact.
> >> >
> >> > The complexity of the mark phase is linear to the live set.
> >> > The complexity of the sweep phase is linear to the heap size.
> >> > The complexity of the compact phase is linear to the live set.
> >>
> >> Well...
> >>
> >> The multiplicative factors in either case might make one linearity
> >> perform better than the other on wildly different heap sizes and live
> >> sets. The adversarial example of this is N large byte arrays of size M.
> >> Compact might have to copy N*M bytes (especially when doing sliding
> >> compaction), and sweep might only need to visit N locations to check on
> >> object metadata.
> >>
> >> There are second-order heap size effects for both mark and compact.
> >> Take two heaps with the same live set, one very dense and one very
> >> sparse; the mere fact that you have to walk memory far away adds the
> >> secondary effect of heap size. It would not linger for compacting
> >> collectors, though, as the first few compaction cycles would collapse
> >> the heap to the denser variant.
> >>
> >> That is all to say that algorithmic complexity is a seriously bad way
> >> to reason about (GC) performance.
> >
> > I understand. I created a bunch of flashcards to get a better conceptual
> > understanding of the different GC algorithms; I'll add some additional
> > flashcards to refine my understanding.
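As an aside, Aleksey's adversarial example above can be made concrete with a toy cost model. This is only a sketch with made-up numbers (N = 1,000 arrays of M = 1,000,000 bytes), not a benchmark, and the method names are illustrative:

```java
public class SweepVsCompactCost {
    // Sweep cost model: one metadata visit per object (made-up unit of work).
    static long sweepVisits(long objects) {
        return objects;
    }

    // Sliding-compact cost model: every surviving byte gets copied.
    static long compactBytesCopied(long liveObjects, long bytesPerObject) {
        return liveObjects * bytesPerObject;
    }

    public static void main(String[] args) {
        long n = 1_000;      // N large byte arrays...
        long m = 1_000_000;  // ...of M bytes each
        System.out.println(sweepVisits(n));           // 1000 visits for sweep
        System.out.println(compactBytesCopied(n, m)); // 1000000000 bytes for compact
    }
}
```

Both costs are "linear", yet the constant factors differ by six orders of magnitude here, which is Aleksey's point about reasoning from algorithmic complexity alone.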
> >> > Why would you ever want to include a sweep phase, because the
> >> > complexity of mark-sweep-compact is now linear to the heap size; a
> >> > lot worse compared to 'linear to the live set'. Get rid of the sweep
> >> > and complexity goes down to 'linear to the live set'. What is the
> >> > added value of such an inefficient GC implementation?
> >>
> >> I think m-s-c nomenclature is confusing, as the actual implementations
> >> are doing either m-s or m-c during their individual phases.
> >
> > Exactly. And this was the source of my confusion, caused by an abuse of
> > names.
>
> There are very real and popular collectors that do Mark, Sweep, *and*
> Compact, e.g. the OpenJDK oldgen collectors in both ParallelGC and
> SerialGC. There are very real and popular collectors that do only Mark and
> Sweep, e.g. the non-compacting oldgen of the CMS collector (as long as it
> does not fall back to the STW compacting fullGC). And there are very real
> and popular collectors that do only Mark and Compact (no sweep), e.g. C4,
> Pauseless, Compressor, and recently ZGC.
>
> For an easy-to-follow code example of Mark, Sweep, and Compact, look at
> the (ahem...) psMarkSweep.cpp code for the compacting OpenJDK ParallelGC
> oldgen implementation, which does 4 separate passes
> (http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/7576bbd5a03c/src/share/vm/gc_implementation/parallelScavenge/psMarkSweep.cpp#l201):
>
> 1. A tracing, pointer-chasing pass that marks the live set.
>    Line 514: // Recursively traverse all live objects and mark them
>
> 2. A sweeping pass that computes the new object addresses.
>    Line 579: // Now all live objects are marked, compute the new object addresses.
>    (New addresses are computed to "shift everything to one side", which,
>    when actually done in pass 4, will compact the heap in place.)
>
> 3. A sweeping pass to adjust references to point to the new object
>    addresses.
>    Line 598: // Adjust the pointers to reflect the new locations
>    (This is sometimes referred to as a fixup pass.)
>
> 4. A compacting pass that moves the objects to their previously chosen
>    target locations.
>    Line 645: // All pointers are now adjusted, move objects accordingly
>
> This is a classic Mark, Sweep, and Compact collector. Pass 1 only visits
> live objects, while passes 2-4 linearly traverse the entire heap (live or
> dead), going forward one object at a time, and skipping over the dead
> objects one by one. For at least the first sweeping pass, when you arrive
> at a dead object's header, you still need to follow its type indicator to
> determine the object size in order to know how far to skip ahead to the
> next object. So you visit and follow pointer information in each and every
> object header in the heap, dead or alive.
>
> In contrast, a pure mark-compact can be built to never do a "move from one
> object to another, including the dead ones" pass. E.g. an evacuating mark
> compact can use the mark pass to produce an efficient way to move from one
> live object to the next when evacuating and doing pointer fixup.
>
> There are various schemes for doing this efficiently without ever visiting
> a dead object. E.g. in C4 we keep live marks in a bit vector with one bit
> per 64-bit word of heap (we mark in this bit vector, not in the object
> headers). When it comes time to evacuate e.g. a 2MB region, we just scan
> through the associated 32KB of memory that contains the live bits for that
> 2MB region, using a find-first-bit operation (which on many processors can
> be done efficiently with direct bit-scan, leading-zeros, or trailing-zeros
> instructions), stopping only on set live bits, and visiting only the live
> objects they correspond to. The evacuating compact never looks at a dead
> object's header or follows its klass indicator to figure out its size.
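The live-bit-vector scan Gil describes can be sketched roughly like this in Java. This is illustrative only, not C4's actual code; the bitmap layout (one bit per 64-bit heap word, in a side structure) follows the description above, and the names are assumptions:

```java
import java.util.ArrayList;
import java.util.List;

public class LiveBitScan {
    // Offsets (in 64-bit heap words) of all live objects in a region,
    // recovered from the side mark bitmap without ever touching a dead
    // object's header.
    static List<Integer> liveWordOffsets(long[] bitmap) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i < bitmap.length; i++) {
            long bits = bitmap[i];
            while (bits != 0) {
                // find-first-bit: a single tzcnt/bsf instruction on most CPUs
                int bit = Long.numberOfTrailingZeros(bits);
                offsets.add(i * 64 + bit);
                bits &= bits - 1; // clear the lowest set bit, hop to the next live mark
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        long[] bitmap = new long[3]; // covers 192 heap words
        bitmap[0] |= 1L << 3;        // a live object starting at word 3
        bitmap[1] |= 1L;             // one at word 64
        bitmap[2] |= 1L << 2;        // one at word 130
        System.out.println(liveWordOffsets(bitmap)); // [3, 64, 130]
    }
}
```

Note how the loop's cost depends only on the number of set bits (live objects) plus a linear skim of the compact bitmap, never on the number of dead objects in the region.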
> An evacuating compactor will typically fix object references (to point to
> their correct locations) after copying all the object bodies. Since all
> references in the heap may potentially need fixing, all reference fields
> in all live objects that could possibly point to any relocated objects
> will need to be visited at some point to do this, and each non-null
> reference visited will need to potentially follow a forwarding pointer
> somewhere to find the object's actual target address.
>
> People often point to this fixup pass as a "sweep" of sorts, as it is
> often performed linearly. In collectors where this pass is performed on a
> compacted set of objects, such a "sweep" will only visit live objects, so
> it can still be dramatically less work than visiting all dead object
> headers and following their Klass pointers.
>
> However, with the invention of "Quick Release" (first published in the
> Pauseless and C4 papers, but also used since by Compressor and ZGC; see
> the explanation in section 2.5 of the C4 paper [1]), at least some
> concurrent evacuating regional collectors are able to avoid having a
> separate fixup pass altogether. Storing forwarding pointers outside of
> evacuated regions (as opposed to the traditional way of storing them in or
> near evacuated object headers) enables the recycling of an evacuated
> region's memory for new allocations and for hand-over-hand compaction
> without stomping the forwarding pointers. This not only allows the
> compaction of the entire heap with no "target" memory space needed, it
> enables lazy fixup, since fixup is no longer required to occur before
> memory can be completely recycled.
>
> With quick release and lazy fixup (supported by a loaded reference value
> barrier like LVB), fixup can be delayed as far as the next GC cycle.
> Which means fixup work can be performed in the same pointer-chasing
> tracing pass that would perform the marking for the next GC cycle, with
> some pretty big savings (eliminating an entire pass over all objects in
> the heap).
>
> This means that a modern regional collector that supports concurrent
> evacuation can do a pure mark-compact (no sweep), and can achieve that
> with exactly two passes: a single mark+fixup pass (mark for this GC cycle,
> fixup for the previous), and a single evacuating relocate (but no fixup)
> pass. Pauseless, C4, Compressor, and now ZGC all do exactly that.
>
> Note that the collector *has* to support concurrent compaction (or at the
> very least concurrent fixup) to get this dramatic improvement in
> efficiency, as without it the application could not be allowed to run
> between the compacting evacuation and the eventual fixup. This is a
> practical example of how a concurrent compacting collector can actually be
> built to be more CPU efficient than a stop-the-world compacting collector,
> and a busting of the myth that concurrency always comes at a cost. Because
> of Quick Release, a concurrent collector can fundamentally beat a
> stop-the-world collector in efficiency.
>
> [1] C4: The Continuously Concurrent Compacting Collector. ISMM'11
> https://www.azul.com/files/c4_paper_acm1.pdf
>
> >> The benefit of doing either is well covered.
> >>
> >> The hard naming question is how to properly label the collector where
> >> different phases employ different algos. For example, in CMS the young
> >> collection is copying, the old collection is mark-sweep, and the
> >> fall-back full GC is mark-compact. How would you label CMS then?
> >
> > It would be easier/simpler to label the young and old collectors
> > independently; but I understand your point.
>
> >> I believe marking it m-s-c is borderline tolerable, until users get
> >> confused thinking both sweep and compact happen in the same phase.
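The quick-release / lazy-fixup idea Gil describes above — forwarding information kept outside the evacuated region, with stale references healed lazily by a load barrier — can be sketched roughly as follows. This is a hypothetical toy, not C4's or ZGC's actual implementation: real collectors use tagged pointers and per-region side tables rather than a hash map, and all addresses here are made up:

```java
import java.util.HashMap;
import java.util.Map;

public class LazyFixupSketch {
    // An evacuated region (toy address range). Its memory can be handed back
    // for new allocation immediately ("quick release"), because the
    // forwarding data below lives outside it and cannot be stomped.
    static final long REGION_BASE = 0x1000, REGION_END = 0x2000;

    // old address -> new address, stored off-region. The sketch assumes every
    // live object that was in the region has an entry.
    static final Map<Long, Long> forwarding = new HashMap<>();

    // LVB-style loaded-value barrier: a reference is checked as it is loaded,
    // and forwarded if it still points into the evacuated region. A real
    // barrier would also write ("heal") the forwarded value back to the field,
    // so each stale reference pays this cost at most once.
    static long loadBarrier(long ref) {
        if (ref >= REGION_BASE && ref < REGION_END) {
            return forwarding.get(ref);
        }
        return ref; // already points outside the evacuated region
    }

    public static void main(String[] args) {
        forwarding.put(0x1040L, 0x8040L); // object evacuated from 0x1040 to 0x8040
        System.out.println(Long.toHexString(loadBarrier(0x1040L))); // 8040
        System.out.println(Long.toHexString(loadBarrier(0x9000L))); // 9000
    }
}
```

Because the barrier heals references as the application touches them, any references it never touches can simply be fixed during the next cycle's marking trace, which is exactly the "no separate fixup pass" saving described above.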
> >> Also, with implementations that manage backing storage in fine-grained
> >> chunks (for example, free lists), "sweep" might mean dealing with them
> >> on a per-dead-object basis. For example, this might mean preparing the
> >> storage for subsequent compaction by "sweeping" out old objects before
> >> compaction kicks in, which requires free space. Practical
> >> implementations I know opt to instead reconstruct the backing storage
> >> metadata after massive compaction moves, though.
> >>
> >> My $0.02c, anyway.
> >>
> >> -Aleksey

--
You received this message because you are subscribed to the Google Groups
"mechanical-sympathy" group. To view this discussion on the web, visit
https://groups.google.com/d/msgid/mechanical-sympathy/ad3bf3b1-4a5d-42c1-bee9-371537ef9157%40googlegroups.com.