On Sun, Nov 24, 2013 at 1:45 AM, Ben Kloosterman <[email protected]> wrote:
> People looking at Boehm / Mark Sweep and uncooperative environments > should definetly look at non moving RC-Immix style collector. > Unfortunately, the false retention rate for conservative collectors is astonishingly high, and rises rapidly as a function of heap size (I speak from *bitter* experience). This can easily turn that 10% stuck object number into 30%, which is a different set of conditions entirely. Non-relocation and conservative retention are two different issues. Conservative collectors, of necessity, can't relocate anything that has been conservatively marked, but it's the fact that pointer identification is conservative that leads to the high false retention rate. > The worst case is not a unsustained growth because new blocks are not > needed you can reallocate holes at decreasing mutator efficiency until all > space is used. We do have to be concerned about the fragmentation cost. > I've been wondering about that. My version of this question is: "How many islands can a 'free' block have and still reasonably be treated as free?" If we ignore the question of nursery allocation, then I think free blocks have two purposes: 1. They provide for mid-sized object allocation, and 2. They avoid the need to leave holes in the current TLA's recycled block. I suspect it might work out to define a free block as "a block having at least *k* free regions suitable for mid-sized object allocation". Allocating from the "free" block might now lead to unused subranges if allocation requests come in some unfortunate order. When you fall off the end of the free block, you stick that block in a queue, from which it will be taken the next time the mutator's TLA requires a recycled block. In this way the holes you left behind get filled in. > For fragmentation the best case is every object is a multiple of the line > size in which case there is zero fragmentation , the worst case is where > you have lots of minimum sized objects ( say 32 bytes ) and exactly 1 > survives for a 75 % fragmentation cost. Also its tunable a smaller line > size has lower fragmention at a slight cost to mutator performance. I > think the cost is about 10-15% which could even drop by half for say a 64 > byte line size if memory is crucial eg on a mobile device run a > SpaceEfficient allocator . Either way its not huge as many allocators have > some fragmentation cost and in some cases 4 word headers. One thing to > note is nursery fragmentation is a bigger issue due to the small size of > most short lived objects. > That's internal fragmentation, and it's not what worries me. What worries me is *external* fragmentation, which comes from free ranges of lines that don't get used because of bad luck in allocation sizes and orderings. > The much bigger issue is Rc-Immix has poor performance ( compared to > cutting edge) for heaps up to 1.5* and it isnt great until about 2 * heap > . The reasons is not so much related to copying but I will restate for > emphasis. > 1) Either copying cost is not as efficient as mark sweep or no copying > has poor Gen 1 locality ( the diffirence between these is low) ( for > 1-1.5* heap performance ) > 2) More importantly because of the greater space availability there is > far less work especially ref counting so cycles are more dispersed. > Concurrency can help here depending on barrier cost .. but changes the > whole nature of the collector . > > The fact most GC heaps require 1.5+* memory for really good performance is > a bigger issue than 10% fragmentation. > No no. Those 10% islands will likely lead to 15% or 20% fragmentation. Dropping the heap requirement by 20% is the difference between a 1.5* heap at a given performance level and a 1.2* heap at the same performance. That's why this matters so much. I believe the best intial solutions is non relocating Rc-Immix instrument > it , optomize it which is much easier to do when you dont need to worry > about barrier costs / mutator non allocating behaviour . > Yes. Associated with a precise marking capability. But in parallel with this we want to do a lame-ass relocating implementation, because we want to keep ourselves honest at the runtime and language semantics layers. Though I'd still do a copying nursery in gen1. Also, the "1.5* heap" metric is a very peculiar metric, because nobody runs comparable examinations of manually allocated heaps. shap
_______________________________________________ bitc-dev mailing list [email protected] http://www.coyotos.org/mailman/listinfo/bitc-dev
