Hi Gil,

thanks for your detailed answer.

So mark-sweep-compact implementation really do exist. I'll need to update a 
set of flashcards.

On Wednesday, August 14, 2019 at 12:30:10 PM UTC+3, Gil Tene wrote:
>
>
> On Aug 13, 2019, at 11:34 PM, Peter Veentjer <pe...@hazelcast.com 
> <javascript:>> wrote:
>
> Thanks for your answer Aleksey, 
>
> comments inline.
>
> On Tuesday, August 6, 2019 at 12:41:28 PM UTC+3, Aleksey Shipilev wrote: 
>>
>> On 8/6/19 7:38 AM, Peter Veentjer wrote: 
>> > There is still some confusion on my side. This time it is regarding the 
>> algorithmic complexity of 
>> > mark-sweep-compact. 
>> > 
>> > The complexity of the mark phase is linear to the live set. 
>> > The complexity of the sweep phase is linear to the heap size. 
>> > The complexity of the compact phase is linear to the live set. 
>>
>> Well... 
>>
>> The multiplicative factors in either case might make one linearity 
>> perform better than the other on 
>> wildly different heap sizes and live sets. The adversarial example of 
>> this is N large byte arrays of 
>> size M. Compact might have to copy N*M bytes (especially when doing 
>> sliding compaction), and sweep 
>> might only need to visit N locations to check on object metadata. 
>>
>> There are second-order heap size effects for both mark and compact. Take 
>> two heaps with the same 
>> live set, one very dense and one very sparse, and the mere fact you have 
>> to walk memory far away 
>> adds up the secondary effect of heap size. It would not linger for 
>> compacting collectors, though, as 
>> first few compaction cycles would collapse the heap to denser variant. 
>>
>> That is all to say that algorithmic complexity is a seriously bad way to 
>> reason about (GC) performance. 
>>
>
> I understand. I created a bunch of flashcard to get a better conceptual 
> understanding of the different gc algorithms;
> I'll add some additional flashcard to refine my understanding.
>  
>
>>
>> > Why would you ever want to include the a sweep phase because the 
>> complexity of mark-sweep-compact is 
>> > now linear to the heap size; a lot worse compared to 'linear to the 
>> live set'. Get rid of the sweep 
>> > and complexity goes down to 'linear to the live set'. What is the added 
>> value of such an inefficient 
>> > GC implementation? 
>>
>> I think m-s-c nomenclature is confusing, as the actual implementations 
>> are doing either m-s or m-c 
>> during their individual phases. 
>
>
> Exactly. And this was the source of my confusion and caused by an abuse of 
> names.
>
>
> There are very real and popular collectors that do Mark, Sweep, *and* 
> Compact, e.g the OpenJDK oldgen collectors in both ParallelGC and SerialGC. 
> There are very real and popular collectors that do only Mark and Sweep, 
> e.g. the non-compacting oldgen of the CMS collector (as long as it does not 
> fall back to the STW compacting fullGC). And there are very real and 
> popular collectors that do only Mark and Compact (no sweep), e.g. C4, 
> Pauseless, Compressor, and recently ZGC.
>
> For an easy to follow code example of Mark, Sweep, and Compact, look at 
> the (ahem...) psMarkSweep.c code for the compacting OpenJDK ParallelGC 
> oldgen implementation, which does 4 separate passes (
>
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/7576bbd5a03c/src/share/vm/gc_implementation/parallelScavenge/psMarkSweep.cpp#l201
>  ):
>
> 1. A  tracing, pointer chasing pass that marks the live set.
>     Line 514: // Recursively traverse all live objects and mark them
>
> 2. A sweeping pass that computes the new object addresses 
>     Line 579: // Now all live objects are marked, compute the new object 
> addresses.
>     (new addresses are computed to ”shift everything to one side”,
>      which, when actually done in pass 4, will compact the heap in place)
>
> 3. A sweeping pass to adjust references to point to the new object 
> addresses:
>     Line 598: // Adjust the pointers to reflect the new locations
>     (this is sometimes referred to as a fixup pass)
>
> 4. A compact pass that moves the objects to their previously chosen target 
> locations.
>     Line 645: // All pointers are now adjusted, move objects accordingly
>
> This is a classic Mark, Sweep, and Compact collector. Pass 1 only visits 
>  live objects, while passes 2-4 linearly traverse the entire heap, live or 
> dead), going forward one object at a time, and skipping over the dead 
> objects one by one. For at least the first sweeping pass, when you arrive 
> at a dead object’s header, you still need to follow its type indicator to 
> determine the object size in order to know how far to skip ahead to the 
> next object. So you visit and follow pointer information in each and every 
> object header in the heap, dead or alive.
>
> In contrast, a pure mark-compact can be built to never do a “move from one 
> object to another, including the dead ones” pass. E.g. an evacuating mark 
> compact can use the mark pass to produce an efficient way to move from one 
> live object to the next when evacuating and doing pointer fixup.
>
> There are various schemes for doing this efficiently without ever visiting 
> a dead object. E.g. In C4 we keep live marks in a bit vector with one bit 
> per 64 bit word of heap (we mark in this bit vector, not in the object 
> headers). When it comes time to evacuate e.g. a 2MB region, we just scan 
> through the associated 32KB of memory that contains the live bits for that 
> 2MB region, using a find-first-bit operation (which on many processors can 
> be done efficiently with direct bit scan, learding zeros, or trailing zeros 
> instructions), stopping only on set live bits, and visiting only the live 
> objects they correspond to. The evacuating compact never looks at a dead 
> object header or follows its klass indicator to figure out its size.
>
> An evacuating compactor will typically fix object references (to point to 
> their correct location) after copying all the object bodies. Since all 
> references in the heap may potentially need fixing, all reference fields in 
> all live objects that could possibly point to any relocated objects will 
> need to be visited at some point to do this, and each non-null reference 
> visited will need to potentially follow a forwarding pointer somewhere to 
> find the object’s actively target address.
>
> People often point to this fixup pass as a “sweep” of sorts, as it is 
> often performed linearly. In collectors where this pass is performed on a 
> compacted set of objects, such a “sweep” will only visit live objects, so it 
> can 
> still be dramatically less work than visiting all dead object headers and 
> following their Klass pointers. 
>
> However, with the invention of “Quick Release” (first published in the 
> Pauseless and C4 papers, but also used since by Compressor and ZGC, see 
> explanation in section 2.5 of the C4 paper [1]), at least some concurrent 
> evacuating regional collectors are able to avoid having a separate fixup 
> pass altogether. Storing forwarding pointers outside of evacuated regions 
> (as opposed to the traditional way of storing them in or near evacuated 
> object headers) enables the recycling of an evacuated region’s memory for 
> new allocations and for hand-over-hand compaction without stomping the 
> forwarding pointers. This not only allows the compaction of the entire heap 
> with no “target” memory space needed, it enables lazy-fixup, since fixup is 
> no longer required to occur before memory can be completely recycled.
>
> With quick release and lazy fixup (supported by a loaded reference value 
> barrier like LVB), fixup can be delayed as far as the next GC cycle. Which 
> means fixup work can be performed in the same pointer chasing tracing pass 
> that would perform the marking for the next GC cycle, with some pretty big 
> savings (eliminating an entire OSs in all objects in the heap).
>
> This means that a modern regional collector that supports concurrent 
> evacuation can do a pure mark compact (no sweep), and can achieve that 
> with exactly two passes: a single mark+fixup pass (mark for this GC cycle, 
> fixup fur previous), and a single evacuating relocate (but no fixup) pass. 
> Pauseless, C4, Compressor, and now ZGC all do exactly that.
>
> Note that the collector *has* to support concurrent compaction (or at the 
> very least concurrent fixup) to get this dramatic improvement in 
> efficiency, as without it the application could not be allowed to run 
> between the compacting evacuation and the eventual fixup. This is a 
> practical example of how a concurrent compacting collector can actually be 
> built to be more CPU efficient than a stop-the-world compacting collector, 
> and a busting of the myth that concurrency always comes at a cost. Because 
> of Quick-Release, a concurrent collector can fundamentally beat a stop 
> the world collector in efficiency.
>
> [1] C4: The Continuously Concurrent Compacting Collector. ISMM’11 
> https://www.azul.com/files/c4_paper_acm1.pdf 
>
>
>  
>
>> The benefit of doing either is well covered. 
>>
>> The hard naming question is how to properly label the collector where 
>> different phases employ 
>> different algos. For example, in CMS the young collection is copying, old 
>> collection is mark-sweep, 
>> fall-back full GC is mark-compact. How would you label CMS then? 
>
>
> It would be easier/simpler to label the young and old collector 
> independently; but I understand your point.
>
>  
>
>> I believe marking it m-s-c is 
>> borderline tolerable, until users get confused thinking both sweep and 
>> compact happen in the same phase. 
>>
>
>> Also with implementations that manage backing storage in fine-grained 
>> chunks (for example, free 
>> lists), "sweep" might mean dealing with them on per-dead-object basis. 
>> For example, this might mean 
>>  preparing the storage for subsequent compaction by "sweeping" out old 
>> objects before compaction 
>> kicks in, which requires free space. Practical implementations I know opt 
>> to instead reconstruct the 
>> backing storage metadata after massive compaction moves, though. 
>>
>> My $0.02c, anyway. 
>>
>> -Aleksey 
>>
>> -- 
> You received this message because you are subscribed to a topic in the 
> Google Groups "mechanical-sympathy" group.
> To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/mechanical-sympathy/Kk0rQij7q8I/unsubscribe
> .
> To unsubscribe from this group and all its topics, send an email to 
> mechanical-sympathy+unsubscr...@googlegroups.com <javascript:>.
> To view this discussion on the web, visit 
> https://groups.google.com/d/msgid/mechanical-sympathy/2a29baf3-2027-4eb6-a7f1-4737397f9f0f%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/mechanical-sympathy/2a29baf3-2027-4eb6-a7f1-4737397f9f0f%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mechanical-sympathy+unsubscr...@googlegroups.com.
To view this discussion on the web, visit 
https://groups.google.com/d/msgid/mechanical-sympathy/ad3bf3b1-4a5d-42c1-bee9-371537ef9157%40googlegroups.com.

Reply via email to