On Sun, Nov 24, 2013 at 1:45 AM, Ben Kloosterman <[email protected]> wrote:

> People looking at Boehm / Mark Sweep  and uncooperative environments
> should definetly look at non moving RC-Immix style collector.
>

Unfortunately, the false retention rate for conservative collectors is
astonishingly high, and rises rapidly as a function of heap size (I speak
from *bitter* experience). This can easily turn that 10% stuck object
number into 30%, which is a different set of conditions entirely.

Non-relocation and conservative retention are two different issues.
Conservative collectors, of necessity, can't relocate anything that has
been conservatively marked, but it's the fact that pointer identification
is conservative that leads to the high false retention rate.


> The worst case is not a unsustained growth because new blocks are not
> needed you can reallocate holes at decreasing mutator efficiency until all
> space is  used.  We do have to be concerned about the fragmentation cost.
>

I've been wondering about that. My version of this question is: "How many
islands can a 'free' block have and still reasonably be treated as free?"

If we ignore the question of nursery allocation, then I think free blocks
have two purposes:

   1. They provide for mid-sized object allocation, and
   2. They avoid the need to leave holes in the current TLA's recycled
   block.

I suspect it might work out to define a free block as "a block having at
least *k* free regions suitable for mid-sized object allocation".
Allocating from the "free" block might now lead to unused subranges if
allocation requests come in some unfortunate order. When you fall off the
end of the free block, you stick that block in a queue, from which it will
be taken the next time the mutator's TLA requires a recycled block. In this
way the holes you left behind get filled in.


> For fragmentation the best case is every object is a multiple of the line
> size  in which case there is zero fragmentation , the worst case is where
> you have lots of minimum sized objects ( say 32 bytes ) and exactly 1
> survives for a 75 % fragmentation cost.  Also its tunable a smaller line
> size has lower fragmention  at a slight cost to mutator performance. I
> think the cost is about 10-15%  which could even drop by half for say a 64
> byte line size if memory is crucial eg on a mobile device run a
> SpaceEfficient allocator . Either way its not huge as many allocators have
> some fragmentation cost and in some cases 4 word headers.  One thing to
> note is nursery fragmentation is a bigger issue  due to the small size of
> most short lived objects.
>

That's internal fragmentation, and it's not what worries me. What worries
me is *external* fragmentation, which comes from free ranges of lines that
don't get used because of bad luck in allocation sizes and orderings.


> The much bigger issue is Rc-Immix has poor performance ( compared to
> cutting edge) for  heaps up to 1.5* and it isnt great until about 2 * heap
> . The reasons is  not so much related to copying but I will restate for
> emphasis.
> 1) Either copying cost is not as efficient as mark sweep or  no copying
> has poor Gen 1 locality ( the diffirence  between these is low)  ( for
> 1-1.5* heap performance )
> 2) More importantly because of the greater space availability  there is
> far less work especially ref counting  so cycles are more dispersed.
> Concurrency can help here depending on barrier cost .. but changes the
> whole nature of the collector .
>
> The fact most GC heaps require 1.5+* memory for really good performance is
> a bigger issue than 10% fragmentation.
>

No no. Those 10% islands will likely lead to 15% or 20% fragmentation.
Dropping the heap requirement by 20% is the difference between a 1.5* heap
at a given performance level and a 1.2* heap at the same performance.
That's why this matters so much.

I believe the best intial solutions is non relocating Rc-Immix   instrument
> it , optomize it which is much easier to do when you dont need to worry
> about barrier costs / mutator non allocating behaviour .
>

Yes. Associated with a precise marking capability.

But in parallel with this we want to do a lame-ass relocating
implementation, because we want to keep ourselves honest at the runtime and
language semantics layers.

Though I'd still do a copying nursery in gen1.



Also, the "1.5* heap" metric is a very peculiar metric, because nobody runs
comparable examinations of manually allocated heaps.


shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev

Reply via email to