This email is primarily for Marijn, as he's adopting the "alias analysis pass" task (https://github.com/graydon/rust/issues/408) from the release-milestone list for now, but I figured I'd write to the list so everyone else involved can see what we're thinking (and offer comments or suggestions) too.

The issue here is that Rust's current alias (&) slots permit circumvention of ... most other rules in the system (type safety, memory safety, type-state safety). Even outside any proposed "unsafe" sub-dialect.

This email is not fishing for recommendation of "how to get rid of alias" (though we may rename it "reference" to be more C++-like); rather it's a discussion of "the rules we will enforce on aliases to make them safe". The entire point of & is to be "a safe version of & in C++", and it's not quite safe enough yet. Needs a bit more safety. Unfortunately with pointers, well, you know how it is: even "a little bit unsafe" means "can saw a hole in the universe". So & is presently.

The task of an alias-analysis pass is to reject programs it can't prove to be safely using &. It might additionally involve injecting safety-ensuring code of sorts I'll discuss below.

First, how things can go wrong: if an alias points into a mutable memory structure (or substructure of a mutable) that is simultaneously reachable via any other path (alias or otherwise), then code that mutates the one may invalidate the other. Invalidation may mean:

  - The alias may point to memory that assumes a given typestate
    predicate, which may no longer hold (by the mutation).

  - This includes the 'init' predicate; in other words the alias
    may point to memory that was dropped, causing a UMR.

  - The alias may point to tag-variant X which has been changed to
    tag-variant Y underfoot, causing data corruption or a UMR.

There may be other cases, but these at least spring to mind. We have to prohibit all of these. Some of us have discussed this in the past (this hazard has been evident from the beginning) and have been assuming an analysis pass should be able to reject (conservatively) such cases.

The strategy we discussed was to attempt to prove pairwise disjointness of all non-identical lvals (named points-of-reference through which a read or write may be made). So this means building a list of N*N pairings for the N lvals in a function, and considering each pair independently. With something like the following "main" rules:

  - A pair of *identical* lvals (resolve to the same def_id) are ok.
  - A pair of lvals that point to fully disjoint types (neither type is
    a substructure of the other) are ok.
  - A pair of lvals that point to immutable types are ok.
  - A pair of lvals where neither is rooted in an alias is ok.
  - A pair of lvals where either or both are incoming arguments to a
    function are ok.

The first few of these are obvious; the last point is the tricky one and requires some explanation (as well as support machinery). Aliases are only formed at function-call sites, so there is a sense in which the incoming aliases to a function are the "induction hypothesis" for all subsequent judgments in the body of the function: things that we *assume* hold on the aliases we get, and that we must *prove* to hold over the aliases we form. We thought of four possible rule-sets governing the alias formation:

 1. We permit a caller to form or pass aliases that point into
    other passed arguments, and rely on the first 4 rules in the callee
    to check all actions with those aliases (as well as formation of
    new ones) are safe.

 2. We prohibit callers forming or passing an alias that may point
    into another passed argument, making the 5th rule hold in the
    callee and rejecting programs that form maybe-overlapping alias
    arguments.

 3. We permit ruels #1 and #2 but differentiate them by something
    like the C99 'restrict' keyword. The 5th "main" rule works for
    restricted aliases but not unrestricted aliases; the compiler
    must prove them disjoint using only the first four rules.
    Restricted-ness becomes a part of the type signature so the caller
    knows what it has to prove as well.

 4. We permit cases #1 and #2 but provide a "distinguish X" operation
    that *dynamically* ensures that X (an alias) doesn't point into
    any of the other lvals it's possibly non-disjoint from in the
    current scope (under the first four "main" rules). This can either
    be automatically injected or manually written by users.

We considered all four options governing formation, and tentatively settled on the 2nd, IIRC, but the other 3 options remain possible here as well. In any case we should build one and see how usable or cognitively burdensome it is.

-Graydon
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev

Reply via email to