On Mon, Jul 15, 2013 at 6:39 PM, David Jeske <[email protected]> wrote:

> On Mon, Jul 15, 2013 at 5:34 PM, Jonathan S. Shapiro <[email protected]> wrote:
>
>> On Mon, Jul 15, 2013 at 3:09 PM, David Jeske <[email protected]> wrote:
>>
>>> GC tracing work is proportional to pointer-count and program-duration.
>>> There are certain programs for which that model can not equal C performance.
>>>
>>
>> That just isn't clear. The problem is that the tracing work is roughly
>> identical to the alternative allocation and deallocation work in C/C++.
>>
>
> Whoa there! Stop the presses. Now this is an interesting debate! (Which, I
> believe, will be short-lived.)
>
> 1) Consider two programs, one C/malloc/free, one "the most ideal
> zero-pause multi-generational copying GC we can imagine".
>
> 2) The work of the program will accumulate and discard a constant amount
> of random data from the long-lived data-set as it runs. If you prefer, it
> can accumulate and discard a constant fraction of the long-lived dataset
> instead. (there are many such interesting programs, including 3d games,
> databases, data-analytics front-ends, etc. etc.)
>
> 3) Now, increase that long-lived data-set until its size nears the
> theoretical infinity -- or the practical inconveniently large.
>
> How could a copying-GC system come anywhere close in overhead (memory,
> throughput, memory-management-cpu) to unsafe malloc/free for this case?
>
> As dataset size grows, the GC version consumes increasing amounts of CPU
> and memory bandwidth walking the ever-larger dataset trying to find the
> discards. Every N discard/allocation cycles, it will have to scan the
> entirety of memory. Even ARC will beat copying-collection in this case,
> which is why that's basically what the hand-coded C program does.
>
> What facts are we seeing differently that leads you to your statement
> above?
>


1. Your assumption about how often the tenured space has to be fully walked.

2. Your assumption about the subset of tenured space that must be walked to
collect earlier generations.

3. Your assumption that no idle cycles are available in which to do the
walk.
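
As a rough illustration of how these three assumptions move the comparison,
here is a toy per-cycle cost model in Python. Every function name, parameter,
and default cost below is hypothetical and chosen only for illustration;
nothing here is measured, and it does not model any particular collector.

```python
def malloc_free_cost(churn, alloc_cost=1.0, free_cost=1.0):
    """Per-cycle work for manual management (illustrative unit costs).

    Each churned object is paid for once on allocation and once on free.
    """
    return churn * (alloc_cost + free_cost)


def tracing_gc_cost(live, churn, full_walk_period, walked_fraction,
                    idle_fraction, alloc_cost=0.1, trace_cost=1.0):
    """Amortized per-cycle work charged to the mutator by a tracing GC.

    full_walk_period: cycles between walks of the tenured space (point 1).
    walked_fraction:  share of the tenured space actually walked (point 2).
    idle_fraction:    share of the walk done in otherwise idle cycles (point 3).
    alloc_cost and trace_cost are assumed unit costs, not measurements.
    """
    if full_walk_period <= 0:
        raise ValueError("full_walk_period must be positive")
    if not 0.0 <= walked_fraction <= 1.0:
        raise ValueError("walked_fraction must be in [0, 1]")
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be in [0, 1]")
    # Tracing work per walk, spread over the cycles between walks.
    walk_per_cycle = live * walked_fraction * trace_cost / full_walk_period
    # Only the part that cannot be hidden in idle cycles is charged.
    charged_walk = walk_per_cycle * (1.0 - idle_fraction)
    return churn * alloc_cost + charged_walk


if __name__ == "__main__":
    churn = 1_000
    for live in (10**6, 10**8):
        # Pessimistic: walk everything every cycle, no idle time available.
        pessimistic = tracing_gc_cost(live, churn, 1, 1.0, 0.0)
        # Optimistic: rare walks, small walked subset, mostly idle cycles.
        optimistic = tracing_gc_cost(live, churn, 1_000, 0.01, 0.9)
        print(live, malloc_free_cost(churn), pessimistic, optimistic)
```

Under the pessimistic settings the charged cost grows with the live set, which
is the quoted argument; relaxing any of the three assumptions shrinks the term
that depends on the live set. Which settings are realistic is exactly what is
in dispute.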


The time spent scanning is largely irrelevant. The only interesting metric
is the percentage of total time in which the mutator is executing,
discounting the time spent in malloc, free, and GC-alloc.
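
One way to write that metric down, as a minimal Python sketch: the function
name and its parameters are hypothetical, and it assumes all times are measured
on the mutator's own thread in the same unit.

```python
def mutator_utilization(total_time, malloc_time=0.0, free_time=0.0,
                        gc_alloc_time=0.0):
    """Fraction of total time the mutator spends doing its own work.

    Time spent in malloc, free, and GC allocation is discounted. Scanning
    done off the mutator's thread is not an input, so it cannot lower the
    result in this sketch.
    """
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    memory_management = malloc_time + free_time + gc_alloc_time
    if memory_management > total_time:
        raise ValueError("memory-management time exceeds total_time")
    return (total_time - memory_management) / total_time


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    print(mutator_utilization(10.0, malloc_time=1.0, free_time=1.0))
    print(mutator_utilization(10.0, gc_alloc_time=0.5))
```

Comparing two memory managers on this metric means comparing these fractions
for the same workload, not comparing how much scanning each one performs.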

My guess is that the C4 collector will outperform malloc/free in your
example, with *no* performance degradation in the mutator.


Setting that aside, kicking straw dogs is not a good use of time. Unless
there are meaningful programs that actually exhibit this behavior, it's an
exercise in intellectual masturbation. One can, of course, create
pathological cases for *any* allocator, including fully manual storage
management.


Jonathan
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
