[rust-dev] alias analysis

Graydon Hoare Thu, 02 Jun 2011 14:02:46 -0700

This email is primarily for Marijn, as he's adopting the "alias analysispass" task (https://github.com/graydon/rust/issues/408) from therelease-milestone list for now, but I figured I'd write to the list soeveryone else involved can see what we're thinking (and offer commentsor suggestions) too.

The issue here is that Rust's current alias (&) slots permitcircumvention of ... most other rules in the system (type safety, memorysafety, type-state safety). Even outside any proposed "unsafe" sub-dialect.

This email is not fishing for recommendation of "how to get rid ofalias" (though we may rename it "reference" to be more C++-like); ratherit's a discussion of "the rules we will enforce on aliases to make themsafe". The entire point of & is to be "a safe version of & in C++", andit's not quite safe enough yet. Needs a bit more safety. Unfortunatelywith pointers, well, you know how it is: even "a little bit unsafe"means "can saw a hole in the universe". So & is presently.

The task of an alias-analysis pass is to reject programs it can't proveto be safely using &. It might additionally involve injectingsafety-ensuring code of sorts I'll discuss below.

First, how things can go wrong: if an alias points into a mutable memorystructure (or substructure of a mutable) that is simultaneouslyreachable via any other path (alias or otherwise), then code thatmutates the one may invalidate the other. Invalidation may mean:


  - The alias may point to memory that assumes a given typestate
    predicate, which may no longer hold (by the mutation).

  - This includes the 'init' predicate; in other words the alias
    may point to memory that was dropped, causing a UMR.

  - The alias may point to tag-variant X which has been changed to
    tag-variant Y underfoot, causing data corruption or a UMR.

There may be other cases, but these at least spring to mind. We have toprohibit all of these. Some of us have discussed this in the past (thishazard has been evident from the beginning) and have been assuming ananalysis pass should be able to reject (conservatively) such cases.

The strategy we discussed was to attempt to prove pairwise disjointnessof all non-identical lvals (named points-of-reference through which aread or write may be made). So this means building a list of N*Npairings for the N lvals in a function, and considering each pairindependently. With something like the following "main" rules:


  - A pair of *identical* lvals (resolve to the same def_id) are ok.
  - A pair of lvals that point to fully disjoint types (neither type is
    a substructure of the other) are ok.
  - A pair of lvals that point to immutable types are ok.
  - A pair of lvals where neither is rooted in an alias is ok.
  - A pair of lvals where either or both are incoming arguments to a
    function are ok.

The first few of these are obvious; the last point is the tricky one andrequires some explanation (as well as support machinery). Aliases areonly formed at function-call sites, so there is a sense in which theincoming aliases to a function are the "induction hypothesis" for allsubsequent judgments in the body of the function: things that we*assume* hold on the aliases we get, and that we must *prove* to holdover the aliases we form. We thought of four possible rule-setsgoverning the alias formation:


 1. We permit a caller to form or pass aliases that point into
    other passed arguments, and rely on the first 4 rules in the callee
    to check all actions with those aliases (as well as formation of
    new ones) are safe.

 2. We prohibit callers forming or passing an alias that may point
    into another passed argument, making the 5th rule hold in the
    callee and rejecting programs that form maybe-overlapping alias
    arguments.

 3. We permit ruels #1 and #2 but differentiate them by something
    like the C99 'restrict' keyword. The 5th "main" rule works for
    restricted aliases but not unrestricted aliases; the compiler
    must prove them disjoint using only the first four rules.
    Restricted-ness becomes a part of the type signature so the caller
    knows what it has to prove as well.

 4. We permit cases #1 and #2 but provide a "distinguish X" operation
    that *dynamically* ensures that X (an alias) doesn't point into
    any of the other lvals it's possibly non-disjoint from in the
    current scope (under the first four "main" rules). This can either
    be automatically injected or manually written by users.

We considered all four options governing formation, and tentativelysettled on the 2nd, IIRC, but the other 3 options remain possible hereas well. In any case we should build one and see how usable orcognitively burdensome it is.


-Graydon
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev

[rust-dev] alias analysis

Reply via email to