On 15/08/2011 7:41 AM, Niko Matsakis wrote:

I am taking some time to experiment with Rust and try to get a better
understanding of the syntax and semantics. I have been experimenting
with small programs and just wanted to check and see if I am
understanding things correctly.

Glad to hear your questions! Sorry I've been slow in responding to this.

* If you have a variable of type "T" (for example, "int" or "vec[int]"),
this is a value type. That is, the value itself is generally immutable,
with the exception of objects and records, which may have mutable
fields. This is essentially the same as in O'Caml.

More or less. We're likely to change the 'mutable' field-qualifier to be a full type constructor, for a mutable-cell-that-holds-T, but the distinction is mostly fine detail. It means you'll be able to write "cell 10" to construct a mutable-cell-of-int, rather than only being able to mark fields within records as mutable. Hopefully that simplifies things. We've flip-flopped on this issue a few times already.
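As a rough illustration of the "mutable cell as a type constructor" idea, here is a sketch in present-day Rust, which postdates this thread. It assumes std::cell::Cell as the closest modern analogue of the proposed "cell 10"; the Counter type and the bump function are hypothetical names made up for the example.

```rust
use std::cell::Cell;

// A record whose `hits` field is a mutable cell, while `id` stays immutable.
// (Hypothetical type, for illustration only.)
struct Counter {
    id: u32,
    hits: Cell<i32>,
}

// Increment the cell through a shared, non-mutable reference and
// return the new count.
fn bump(c: &Counter) -> i32 {
    c.hits.set(c.hits.get() + 1);
    c.hits.get()
}

fn main() {
    // A standalone mutable cell holding an int, roughly "cell 10".
    let standalone = Cell::new(10);
    standalone.set(standalone.get() + 1);
    println!("standalone = {}", standalone.get());

    // The same constructor used as a field type inside a record.
    let c = Counter { id: 1, hits: Cell::new(0) };
    bump(&c);
    println!("counter {} has {} hits", c.id, bump(&c));
}
```

The point of the sketch is that mutability is carried by the type of the cell, so it works the same way for a local value and for a record field.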

Also note that vec[int] has gone away from the compiler; we've been changing from always-shared-and-refcounted vectors to more explicitly differentiated vectors: those with uniquely-owned components and/or possibly-shared components. Vectors are somewhat of a special case though. For the rest of this email, let's try talking about non-vectors -- say, integers, tuples, records, etc. -- as it makes the topic a bit clearer :)

* A box type like @T is an instance of type T that lives in the heap.
The value in the box is immutable. To make it mutable, you do @mutable T.

Correct. More specifically @T is a *shared* boxed T. It has a refcount and/or a word of GC header, such that multiple @T variables can point to the same shared heap allocation.

We are also in the process of implementing a *unique* box type ~T where there is always exactly one ~T variable pointing to the heap allocation.
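The shared/unique distinction can be sketched in present-day Rust, under the assumption that std::rc::Rc plays roughly the role of the shared box @T and Box plays roughly the role of the unique box ~T. The function names shared_handles and consume are hypothetical.

```rust
use std::rc::Rc;

// Create a shared box, take a second handle to it, and report how many
// handles now point at the single heap allocation.
fn shared_handles() -> usize {
    let a = Rc::new(10);
    let b = Rc::clone(&a); // second handle, same allocation, refcount bumped
    assert!(Rc::ptr_eq(&a, &b));
    Rc::strong_count(&a)
}

// A unique box has exactly one owner; passing it by value moves it.
fn consume(b: Box<i32>) -> i32 {
    *b
}

fn main() {
    println!("shared handles: {}", shared_handles());

    let u = Box::new(10);
    let v = consume(u); // `u` is moved here and cannot be used afterwards
    println!("unique box held {}", v);
}
```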

* An alias type like &T is a pointer to a value of type T. This value
must live either on the stack or in the field of a record. It may point
to a mutable location or a non-mutable location.

Correct. An alias can point to a stack-local, a field of a record, the interior of a shared box ... any such place. But the compiler is obliged to prove that the referent outlives the reference, or else reject compiling the code that forms the alias.
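A minimal sketch of the same rule in present-day Rust, assuming that a plain reference &T corresponds to the alias type and Rc to the shared box. Point, read and sum_of_aliases are hypothetical names. The commented-out lines show the kind of alias the compiler refuses to form because the referent would not outlive it.

```rust
use std::rc::Rc;

// Hypothetical record type, for illustration only.
struct Point {
    x: i32,
    y: i32,
}

// Reads through an alias without taking ownership of the referent.
fn read(alias: &i32) -> i32 {
    *alias
}

fn sum_of_aliases() -> i32 {
    let local = 1; // a stack-local
    let p = Point { x: 2, y: 3 }; // a record with two fields
    let boxed = Rc::new(4); // a shared box

    // Aliases to a local, to record fields, and to the interior of the box.
    read(&local) + read(&p.x) + read(&p.y) + read(&*boxed)
}

fn main() {
    println!("{}", sum_of_aliases());

    // Rejected by the compiler if uncommented, because `short` dies
    // before `dangling` is used:
    // let dangling: &i32;
    // { let short = 5; dangling = &short; }
    // println!("{}", dangling);
}
```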

* A mutable alias type like "& mutable T" is a pointer to a mutable
location of type T.

Correct.

* Equality is "deep" equality for values of type "T", "&T", "*T",
pointer equality for mutable values like "@mutable T" or records with
mutable fields.

Not quite. I think at present we're doing (or aiming to do) this:

   T :  interior values, compare contents
  ~T :  unique pointers, compare contents
  @T :  shared pointers, compare pointer-value
  *T :  unsafe pointers, compare pointer-value
  &T :  alias values, compare however aliased type compares
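The table can be approximated in present-day Rust, with two caveats that are assumptions of this sketch rather than anything stated in the thread: Box stands in for ~T and Rc for @T, and in present-day Rust the == operator on Rc compares contents, so the pointer-value comparison described above is spelled Rc::ptr_eq instead. The helper names same_allocation and same_contents are hypothetical.

```rust
use std::rc::Rc;

// Pointer-value comparison for shared boxes: same allocation or not.
fn same_allocation(a: &Rc<i32>, b: &Rc<i32>) -> bool {
    Rc::ptr_eq(a, b)
}

// Content comparison for unique boxes.
fn same_contents(a: &Box<i32>, b: &Box<i32>) -> bool {
    a == b
}

fn main() {
    // Interior values compare by contents.
    assert!((1, 2) == (1, 2));

    // Unique pointers: two separate allocations with equal contents.
    assert!(same_contents(&Box::new(5), &Box::new(5)));

    // Shared pointers compared by pointer value.
    let a = Rc::new(5);
    let b = Rc::new(5);
    let a2 = Rc::clone(&a);
    assert!(same_allocation(&a, &a2));
    assert!(!same_allocation(&a, &b));

    // Raw (unsafe) pointers compare by address.
    let x = 5;
    let y = 5;
    let px: *const i32 = &x;
    let py: *const i32 = &y;
    assert!(px != py);

    // References compare however the referent type compares.
    assert!(&x == &y);
}
```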

In particular, I am trying to understand the precise intention behind
alias types (&T). Is it correct that an alias type points to some
variable whose address is guaranteed to outlive the current function?

Yes.

The compiler presumably wants to be able to copy values of type &T
without worrying about reference counts or garbage collection?

Yes. The & type is for passing parameters by-pointer when you really only intend to "loan access to the referent" to the callee, not actually give ownership (shared or otherwise).

This
would explain why a variable "x" of type "@T" cannot be used as the
value for a parameter of type "&T", and one must write "*x" instead,
because "*x" makes a temporary copy of the value at the time of the call.

It doesn't make a temporary copy. It just dereferences the shared @ pointer, so the alias points to the interior of the box. Otherwise the alias would point to the @ itself.

For example, all this code is correct (if slightly odd):

fn takes_a_box_alias(x: &@int) { log *x; }

fn takes_an_int_alias(x: &int) { log x; }

fn main() {
    let y = @10;
    takes_a_box_alias(y);
    takes_an_int_alias(*y);
}
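For readers more familiar with present-day Rust, the example above can be approximated as follows, assuming Rc in place of the shared box @. The function alias_points_into_box is a hypothetical helper added to check that dereferencing makes no copy: the alias formed from the dereferenced box has the same address as the box's interior.

```rust
use std::rc::Rc;

// Alias to the shared box itself.
fn takes_a_box_alias(x: &Rc<i32>) -> i32 {
    **x
}

// Alias to an int, wherever that int happens to live.
fn takes_an_int_alias(x: &i32) -> i32 {
    *x
}

// True when an alias formed by dereferencing the box points at the
// box's interior rather than at a temporary copy.
fn alias_points_into_box(y: &Rc<i32>) -> bool {
    let interior: &i32 = &**y;
    std::ptr::eq(interior, Rc::as_ptr(y))
}

fn main() {
    let y = Rc::new(10);
    println!("{}", takes_a_box_alias(&y));
    println!("{}", takes_an_int_alias(&*y));
    println!("{}", alias_points_into_box(&y));
}
```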

I haven't begun looking at unique types like "~T" yet, but I know they
exist. Not sure if there are other type variations I am not aware of.

No, these are all the pointer types.

I haven't explored memory management much in my own tests, but from
reading threads I believe that non-boxed values are (conceptually)
deallocated when the variables go out of scope (the implementation
may internally use reference counters, e.g., for vectors, but this
is invisible to the user).  Boxed values (@T) are deallocated when
their reference count reaches zero.

More or less. Interior (allocated-in-frame) variables, yes, they die when they go out of scope.

We initially based the @ box type entirely on reference counting, with that being a mandatory part of the semantics to ensure proper top-down destruction order on values with destructors. We initially prohibited cycles (this was before publishing anything), then later relaxed that restriction and tried stratifying types into maybe-cyclic and definitely-acyclic (with GC applied only to the former). There are still vestiges of this distinction in the runtime interface.

Now that we have unique boxes and a kind system that differentiates unique types from shared (rather than maybe-cyclic from definitely-acyclic), we're likely to make the refcounting/GC distinction within the shared kind more vague, as you say. We would just require that types carrying destructors ("resources") don't inhabit the shared kind, so there's always a single owner and an unambiguous top-down destruction sequence on *them*. Then shared boxes may wind up being any mix of refcounting and general cycle-aware GC. I expect a few of the project members will make a branch where those boxes are 100% GC'ed to see how it performs; we'll have to see.
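The "single owner, top-down destruction" idea can be sketched in present-day Rust, assuming the Drop trait as the destructor mechanism and Box as the unique owner. The Node type and the destruction_order function are hypothetical; the log exists only so the order can be observed.

```rust
use std::cell::RefCell;

// A hypothetical resource with a destructor. Each node uniquely owns
// its child, so there is exactly one path of ownership from the root.
struct Node<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
    #[allow(dead_code)]
    child: Option<Box<Node<'a>>>,
}

impl<'a> Drop for Node<'a> {
    // Runs before the node's fields (including `child`) are destroyed.
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Build root -> child -> leaf, let the root go out of scope, and return
// the order in which the destructors ran.
fn destruction_order() -> Vec<&'static str> {
    let log: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
    {
        let leaf = Node { name: "leaf", log: &log, child: None };
        let child = Node { name: "child", log: &log, child: Some(Box::new(leaf)) };
        let _root = Node { name: "root", log: &log, child: Some(Box::new(child)) };
    } // `_root` dies here; its owned children follow, from the top down
    log.into_inner()
}

fn main() {
    println!("{:?}", destruction_order());
}
```

Because every node has a single owner, the order in which destructors run is determined by the ownership structure alone, with no dependence on when a refcount happens to reach zero or when a collector runs.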

Hth, feel free to ask followup questions.

-Graydon
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev
