Re: [rust-dev] Interesting paper on RC vs GC

Matthieu Monrocq Tue, 01 May 2012 04:07:36 -0700

On Tue, May 1, 2012 at 9:53 AM, Florian Weimer <[email protected]> wrote:


> * Sebastian Sylvan:
>
> > R. Shahriyar, S. M. Blackburn, and D. Frampton, "Down for the Count?
> > Getting Reference Counting Back in the Ring," in Proceedings of the
> > Eleventh ACM SIGPLAN International Symposium on Memory Management,
> > ISMM ‘12, Beijing, China, June 15-16, 2012.
> >
> > http://users.cecs.anu.edu.au/~steveb/downloads/pdf/rc-ismm-2012.pdf
>
> I think they give up deterministic finalization, which would make this
> approach not suitable to Rust.
>

What do you mean exactly by deterministic finalization ? (I think I
understand but my answer may be off)

I think there are two different scenarios for determinism:
- a stack-allocated object, or an object referred to by a unique reference
(~) can easily be deterministically cleaned up (end of the scope)
- an object referred to by a shared reference can be deterministically
cleaned up only with accurate reference counting *and* in the absence of
cycles

AFAIK the latter (absence of cycles) is not merely theoretic, for example
in Python, there is no guarantee that the __del__ method will be called for
an object in the situation of cycles because the GC has to "break" the
cycle at some point and, I think, "sacrifices" at least one object to do
so. The article touches on statistic analysis to prove that an object,
given its type, cannot be part of a cycle (ie, it cannot maintain shared
references to other instances of its own type) however this can get
impossible with ifaces: they form a compilation firewall.

Still, the one solution to get deterministic finalization would be to
forbid shared references to objects with an explicit destructor action if
it cannot be statically proven they will not be part of a cycle. Or at
least give a warning in this (hairy) case. I don't know how frequent it'll
get.

---

Moving on, the article has been written in the context of Java, which does
not give any thought to immutability and has a much weaker type system.

Immutability has a profound effect on references: immutable objects can
only form a DAG! This alleviates the *one* big drawback of reference
counting: no cycles of references means that there is no need for fallback
strategies. Of course, as the article outlines, it might be worth having a
tracing collection over young objects anyway.

On the other hand, I do not know how well immutability deal with mutable
counters embedded in the object. One of the selling point of immutability
is referential transparency and the ease of sharing between concurrent (and
thus possibly parallel) tasks. However this mutable counter means that at
the bit-level the memory is not frozen, making sharing more difficult. I am
not immersed enough in Rust: does Rust has a header for objects like Java
does ? (I think not...)

Furthermore, Rust's give emphasis to several pointer types, and the fact
that `&` has no ownership associated means that it forms a natural barrier
to reference counting. A function manipulating a `&str` does not have to
worry about its ownership, it knows the object will outlive its call. This
effectively removes lots of inc/dec operations. In Java this can only be
inferred with escape analysis, and I don't know how much they rely on it or
can actually infer with it.

As a consequence, I am unsure of the impact this article should have on
Rust's GC design. The implementation strategies presented are very clear
and the advantages/drawbacks clearly outlined, which is great (big thank
you to the OP); however the benchmarks and conclusions might be a tad
Java-centric and not really apply to Rust's more advanced type system.

---

Finally I think it might be worth considering having two distinct GC
strategies:
- one for immutable objects (that only references other immutable objects)
- one for the rest (mutable objects with potential cycles)

I see no reason to try and employ the same strategy for such widely
different profiles other than the potential saving in term of coding
effort. But then trying to cram every single usecase in a "generic"
algorithm while keeping it efficient seems quite difficult too, whereas
having several specialized mechanisms might make for much clearer code.

One idea I have toyed with for my own was to have simple stubs: design a
clear API for GC, with two (or more) sets of functions for example here,
and call those functions instead of inlining their effect (in the IR). By
providing the functions definitions externally (but inlining them in each
IR module) this makes it easy to switch back and forth between various
implementations whilst still retaining the efficiency of the LLVM backend
to inline/optimize the calls.

This means one can actually *test* the strategies, and perhaps even let the
user *choose* which one better suits her needs. Of course coherency at the
executable level might be necessary.

-- Matthieu

_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev

Re: [rust-dev] Interesting paper on RC vs GC

Reply via email to