On Fri, Jun 7, 2013 at 12:03 AM, Bill Myers <[email protected]> wrote:
> Reference counting is generally more desirable than garbage collection,
> since it is simple and deterministic, and avoids scanning the whole heap of
> the program, which causes pauses, destroys caches, prevents effective
> swapping and requires to tolerate increasing memory usage by a
> multiplicative factor to allow GC to be amortized to be linear in the number
> of dead objects in the general case.
Reference counting is faster than a good garbage collector with large,
long-lived objects, but there are also many cases when it's slower.
Small objects are greatly increased in size by the reference count and
in Rust for mutable reference counted types, there's also the freeze
state. On x86_64, a 8-byte allocation becomes 24 bytes under a minimal
mutable reference counted type (value, count, freeze) and then rounded
by jemalloc/tcmalloc/glibc to a 32 byte allocation.
If you include weak pointers, you need a second allocation with the
pointer, strong reference count and weak reference count (24 bytes,
rounding to 32 bytes) and the main allocation will have the actual
8-byte object and the freeze status (so 16 bytes).
If you pass around the objects by-value frequently, the reference
counting will greatly outweigh the cost of garbage collection cycles.
A garbage collector can compact the heap, since mutable boxes keep
track of whether they are borrowed from for freezing. A good allocator
like jemalloc can avoid heap fragmentation from small objects but not
large ones like vectors.
> However, there seems to be a good solution: the following rule extends both
> the "no mutations" and "no rc pointers" cases while allowing the above, and
> allows creating rc pointers to any type at the expense of forbidding some
> mutations:
> *** If there is a cycle of pointers (of any kind, including raw, owned and
> borrowed, excluding raw pointers annotated as weak references) or enums
> containing at least one reference-counted pointer, forbid mutating all those
> pointers (even with an &mut) unless the object being assigned is the result
> of an expression that doesn't include pointers to previously-created objects
> (and thus cannot point to the pointer), or the value containing the pointer
> is on the stack or owned by a stack slot (and thus is not in an rc box or
> owned by it, and so cannot be pointed to by the assigned value) ***
>
> Here a cycle means a cyclic sequence of structure fields such that each can
> point to an instance of the structure of the next field in the sequence.
>
> So for instance this case would be allowed:
> struct A {b: Rc<B>}
> struct B {a: Option<Rc<A>>}
>
> but the A.a and B.b fields would be immutable even with &mut A or &mut B
> pointers, and would only be mutable if the A or B structure is on the stack
> or owned from the stack (and thus has no incoming rc pointers).
>
> On the other hand, this would be allowed with no restrictions since the
> points-to relationship of the types is acyclic:
> struct A {b: RcMut<B>)
> struct B {c: Rc<C>}
> struct C {x: u32}
>
> To implement this, one would need to extend the language with an annotation
> to tell the compiler that the raw pointers inside of Rc/RcMut/ARC/RWARC/etc.
> are to be treated as "reference-counted" pointers.
>
> The compiler can then perform the static analysis required (do an SCC
> decomposition of a graph containing types, enum members and pointer fields,
> with edges from types to contained types, enum members and pointer fields,
> from pointer fields to the pointed type, from traits to all possible
> implementing types, from enum to enum member, and then forbid modification
> to pointers in any SCC with at least one reference-counted pointer)
>
> There are at least two complexities: public trait pointers needs to be
> assumed to be able to point to anything (unless one can see the whole
> program), while generic types will need to be analyzed in their specialized
> versions, which means that some methods would be valid to call only for some
> values of generic type parameters, and that the analysis needs to be done
> from scratch globally for every module compilation.
This would be a big step away from the advantages of Rust's current
trait system. Right now, if the definition of a generic function type
checks, it's valid for all possible types implementing the trait
bounds. There are no hidden or implicit requirements.
> In addition to all this, weak versions of all rc pointers should be
> supported (to allow weak "parent" references, for instance), which would
> require a further annotation telling the compiler that the contained unsafe
> pointer is "weak" and thus does not participate in the pointer cycle
> analysis.
Weak pointers involve a performance hit for using reference counting
without them as covered above. Additionally, they also open up another
kind of dynamic failure (dead weak pointers) avoided by using garbage
collection.
> Also, a keyword can be provided to allow the forbidden mutations at the cost
> of doing a full run-time scan of the assigned value to ensure it doesn't
> point to the object with the field being assigned, and failing or returning
> an error if the check fails (this probably needs to be explicit since it can
> of course be very expensive to do the scan)
>
> Some structures like mutable strong doubly-linked lists would not be allowed
> by this and would instead require an ad-hoc implementation with raw pointers
> (where the list node smart pointer will increase the reference count on both
> the node, and the whole list, and splicing will be O(n)).
>
> This infrastructure should be able to handle most programs without using
> garbage collection, except those that are interpreters of garbage-collected
> languages or implement interfaces that require garbage collection (e.g.
> AF_UNIX sockets with fd passing, where the fd can be an unix socket itself).
>
> It's also possible that there are even better approaches in the literature
> or other languages.
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev