On Wednesday, 14 September 2016 at 13:28:45 UTC, finalpatch wrote:
> On Tuesday, 13 September 2016 at 18:24:26 UTC, deadalnix wrote:
>> No you don't, as how often the GC kicks in depends on the rate at which you produce garbage, which is going to be very low with a hybrid approach.

> This is simply not true.

> Assume in a pure GC program the GC heap can grow up to X Mb before a collection cycle happens, which has to scan X Mb of memory.

No, it has to scan the live set. If we assume we are ready to accept a 2X overhead in the GC heap in that program, the collection cycle needs to scan X/2 Mb of memory. We're off to a bad start here.

> Now let's say we have a hybrid program that uses 0.5X Mb of RCed memory and 0.5X Mb of GC memory, so the total memory consumption is still X Mb. When the GC heap reaches 0.5X Mb, it has to scan both RC and GC memory.

Your assumption that there are 2 heaps is bogus: your 2 programs have different live sets (assuming X = 1Mb, A has a 500kb live set and B a 750kb one). In addition, why the fuck is your RC system only able to reclaim 50% of the garbage emitted? Even with such stupid numbers, you end up with program A able to manage 500kb with 100% overhead, and program B able to manage 750kb with 33% overhead, which completely proves my point: the hybrid approach is far superior.
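To put concrete numbers on that, here is a quick back-of-envelope check in Python (my own sketch, not anything from the original posts; it assumes X = 1Mb and that the RCed half of program B's heap is entirely live, which is how the 500kb/750kb figures fall out):

    # Sketch only: assumes X = 1Mb (1000kb) and that B's RCed memory is all live.
    X = 1000  # total memory budget in kb

    # Program A: pure GC, heap grows to X before collecting (2X overhead),
    # so its live set is X/2.
    live_A = X / 2
    overhead_A = (X - live_A) / live_A      # -> 1.0, i.e. 100%

    # Program B: hybrid, X/2 of RCed memory (all live) plus a GC heap of X/2
    # that itself gets a 2X overhead, so X/4 of it is live.
    live_B = X / 2 + X / 4
    overhead_B = (X - live_B) / live_B      # -> ~0.33, i.e. 33%

    print(live_A, overhead_A)               # 500.0 1.0
    print(live_B, overhead_B)               # 750.0 0.333...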

Now let's get an appropriate model of how things work in the real world. Let's assume we have a program that emits 1Mb of garbage per second and has a live set of 1Mb, and that we can accept a 2X memory overhead for the GC.

With the pure GC approach, we emit 1Mb of garbage per second on top of our live set of 1Mb, so we need one collection cycle per second. This collection cycle has to scan the live set, namely 1Mb of data.

With the hybrid approach, we still emit 1Mb of garbage per second, but the RC system can reclaim 90% of it. We end up with a rate of garbage for the GC to collect of 100kb per second. If we allow the same memory overhead, we end up with a collection cycle every 10s. The live set still has the same size, so the GC still has to scan 1Mb of data. Therefore, we have effectively divided by 10 the resources we need to allocate to the GC.
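Spelling the arithmetic out (again my own sketch with the numbers above; the 90% reclamation rate is the assumption stated in this post, not a measurement):

    # Sketch only: live set, garbage rate and 90% RC reclamation are the
    # assumptions from the paragraph above.
    live_set = 1.0       # Mb, same for both programs
    garbage_rate = 1.0   # Mb of garbage emitted per second
    overhead = 2.0       # accept a 2X memory overhead

    slack = (overhead - 1.0) * live_set       # garbage we can pile up before collecting: 1Mb

    # Pure GC: all garbage lands in the GC heap.
    cycles_per_s_gc = garbage_rate / slack            # -> 1 collection per second

    # Hybrid: RC reclaims 90% of the garbage immediately.
    rc_reclaims = 0.9
    gc_garbage_rate = garbage_rate * (1 - rc_reclaims)
    cycles_per_s_hybrid = gc_garbage_rate / slack     # -> ~0.1, one collection every 10s

    # Each cycle still scans the same 1Mb live set, so GC work per second
    # drops by the same factor of 10.
    print(cycles_per_s_gc, cycles_per_s_hybrid)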

> It's quite obvious that the time (t) it takes for program 1 to produce X Mb of garbage is the same as for program 2 to produce 0.5X Mb of garbage, and after time t, both programs have to scan X Mb of memory. However, program 2 also has to pay the cost of reference counting on top of that.

Rule of thumb: when someone starts with "it's obvious that", you can be sure that 99% of the time, what follows is confirmation bias rather than anything cogent. I think we've established this is the case here.
