Re: [rust-dev] alias analysis

Graydon Hoare Fri, 03 Jun 2011 08:40:08 -0700

On 03/06/2011 7:29 AM, Marijn Haverbeke wrote:

[This is just a rambling e-mail outlining some problems I'm running
into. Though I am stressing these problems to make sure they are not
glossed over, I'm *not* suggesting we give up aliases or anything like
that.]

Much appreciated; if something's to be shot down, earlier is better!
It's already embarrassingly late in the game to be working out these
rules in full. Sadly.

The issue I'm running into is that obj types*, function types, and
type parameters are 'opaque' and, as such, can contain everything.


Ouch. This is quite a wrinkle.

This means that any boxed value returned from a function that took a
stateful obj, parameterized type, or function type (or something that
contains one!) must be suspected of being reachable from that object.

Suspected of, yes. Let's work on eliminating the suspicion. Or limiting
its severity.
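
A minimal sketch of the suspicion in question, written in present-day
Rust syntax (which postdates this thread), with `Rc` standing in for a
shared box and a closure standing in for an opaque fn type. The names
`make_source` and `hidden_sharing` are hypothetical and exist only for
illustration; nothing in the closure's type reveals that it keeps a
reference to every value it hands out.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical opaque value: from the outside this is only "some FnMut".
// Its type does not say that it retains the box it returns.
fn make_source() -> impl FnMut() -> Rc<RefCell<Vec<i32>>> {
    let kept = Rc::new(RefCell::new(vec![1, 2, 3]));
    move || Rc::clone(&kept)
}

// Two results obtained from the same opaque value turn out to alias.
fn hidden_sharing() -> (usize, Vec<i32>) {
    let mut source = make_source();
    let a = source();
    let b = source();
    // Mutating through `b` is visible through `a`.
    b.borrow_mut().push(4);
    // Owners: the closure's captured copy, `a`, and `b`.
    let owners = Rc::strong_count(&a);
    let seen_through_a = a.borrow().clone();
    (owners, seen_through_a)
}
```

Without looking inside `make_source`, a caller has no way to rule out
that `a` and `b` share storage, which is why the analysis has to assume
the worst about anything returned from such a value.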

If we go with function parameter aliasing solution #2 (not
allowed to pass aliasing things) then any function in such a
context-passing module that takes both a context and an alias, can not
be called on any boxed (or box-containing) value that was returned
from another context-taking function. This seems like it'll invalidate
a serious percentage of the code in the current compiler, with no
obvious way to 'fix' it.

Possibly. I mean, your analysis is correct about ways it can go wrong,
but I'm not sure it's always going to be a pervasive problem, or
unfixable. I'd like to continue with solution #2 (let's call this the
"strong induction hypothesis" solution) for a moment and consider ways
of changing the code or helping it reveal its safety:


  - When we have down-fn-args (in the form of lambda blocks) we will
    be able to turn many-or-most obj field accessor methods into
    iter-like constructs, yes? Like, what we do now as:

        obj foo { fn get_bar() -> bar; }

    may well turn into:

        obj foo { fn with_bar(fn(&bar) &f); }

    such that clients stop writing:

        my_foo.get_bar().do_a_thing();

    and start writing:

        my_foo.with_bar() {| b | b.do_a_thing(); }

    which carries the pleasant performance allowance of letting the
    obj keep its bar member either allocated inline or held as a
    unique box (which can only alias with another alias, not a
    shared box). If this becomes very common, consider whether we want
    an attribute-like (getter/setter pair) syntax for objs.

  - Along those lines: consider what happens when we have unique boxes
    in general, and whether returning "shared box" from a function
    will be quite so common an operation. If the function's job is to
    *construct* values of type foo, then even if boxed it makes much
    more sense to return ~foo than @foo since, at the time of function
    return, the out-pointer is (probably) the sole reference anyways.

  - Further into the unique-ownership line of thinking: when there's a
    kind system up and running (which you may find a necessary component
    of formulating this analysis properly -- they're closely related!)
    it might be possible to constrain the types of an opaque to
    be "unknown but tree-shaped" (not containing shared pointers).
    We have discussed always considering obj and fn types as opaque
    to the kind system too (and assuming the worst about them) but
    perhaps this is too loose. What would happen to the issue if we
    could say "this obj type only has tree-kind memory inside it"?
    Or further, given the depth of the hazard here: what if we
    *required* that for all obj and fn types? Would shared boxes lose
    all utility? Would too many idioms stop working?
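
The first two points above can be sketched in present-day Rust syntax
(which postdates this thread). This is only an illustration under
stated assumptions: `Foo`, `Bar`, `make_bar` and `use_foo` are
hypothetical names, `Box<Bar>` stands in for the unique box `~bar`, and
a closure stands in for the lambda block passed to `with_bar`.

```rust
// Hypothetical stand-ins for the `foo` and `bar` of the examples above.
struct Bar {
    hits: u32,
}

impl Bar {
    fn do_a_thing(&mut self) {
        self.hits += 1;
    }
}

// A constructor returns a unique box: at the point of return the
// out-pointer is the sole reference to the new value.
fn make_bar() -> Box<Bar> {
    Box::new(Bar { hits: 0 })
}

struct Foo {
    // Held as a unique box; it could equally be stored inline.
    bar: Box<Bar>,
}

impl Foo {
    fn new() -> Foo {
        Foo { bar: make_bar() }
    }

    // Instead of handing out a shared box from `get_bar`, lend the
    // field to the caller's closure for the duration of the call.
    fn with_bar<R>(&mut self, f: impl FnOnce(&mut Bar) -> R) -> R {
        f(&mut *self.bar)
    }
}

fn use_foo() -> u32 {
    let mut my_foo = Foo::new();
    my_foo.with_bar(|b| b.do_a_thing());
    // Read the counter back through the same accessor.
    my_foo.with_bar(|b| b.hits)
}
```

Because the borrow of `bar` ends when `with_bar` returns, no reference
to the field escapes, so the obj stays free to choose how it stores it.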

Going with solution #1 (you may pass aliasing aliases to functions)
instead, we'd be in a situation where an obj or parameterized argument
may alias with every alias passed in, which means that after passing
said obj or parameterized argument to any function you can no longer
be sure your alias is still valid. This is worse than the situation
described above.

This is true, and a reason why I still prefer #2. Though again, it may
be mitigated by some of the "different idioms, focused on uniqueness"
stuff I discuss above. Still, I feel like the weaker induction
hypothesis will make too much stuff fall apart; gut feeling but the best
I have to go on. I'd like to try to get #2 to hold together.

A 'distinguish' operation would provide a way out, but if I understand
what you're proposing correctly it'll traverse the values at run-time.
Proving two big (maybe even cyclic) data structures don't share
structure is an arbitrarily expensive operation.

Agreed, this is a ... burdensome operation. Various versions might work
but they all feel like fixing the wrong problem. And adding cognitive
burden. And runtime landmines, as you say.
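
To make the cost concrete, here is a rough sketch of what a run-time
check of this sort might look like, in present-day Rust syntax (which
postdates this thread). It is not the 'distinguish' proposal itself,
whose details are not spelled out here; `Node`, `reachable` and
`share_structure` are hypothetical names, and `Rc` stands in for a
shared box.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::rc::Rc;

// Hypothetical graph node whose edges are shared boxes.
struct Node {
    edges: RefCell<Vec<Rc<Node>>>,
}

fn new_node() -> Rc<Node> {
    Rc::new(Node { edges: RefCell::new(Vec::new()) })
}

// Collect the address of every node reachable from `root`. The `seen`
// set is what keeps the walk from looping forever on cyclic data.
fn reachable(root: &Rc<Node>) -> HashSet<*const Node> {
    let mut seen = HashSet::new();
    let mut stack = vec![Rc::clone(root)];
    while let Some(node) = stack.pop() {
        if seen.insert(Rc::as_ptr(&node)) {
            for next in node.edges.borrow().iter() {
                stack.push(Rc::clone(next));
            }
        }
    }
    seen
}

// Two values share structure if any node is reachable from both.
fn share_structure(a: &Rc<Node>, b: &Rc<Node>) -> bool {
    !reachable(a).is_disjoint(&reachable(b))
}

fn check_sharing() -> (bool, bool) {
    let (a, b, shared) = (new_node(), new_node(), new_node());
    a.edges.borrow_mut().push(Rc::clone(&shared));
    // Cycle a -> shared -> a (this leaks under plain reference counting).
    shared.edges.borrow_mut().push(Rc::clone(&a));
    let before = share_structure(&a, &b);
    b.edges.borrow_mut().push(Rc::clone(&shared));
    let after = share_structure(&a, &b);
    (before, after)
}
```

Both walks visit every reachable node and need a visited set as large
as the structures themselves, so the check costs time and memory
proportional to the data being compared.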

(Seems we're once again entering uncharted territory. As with the
effects system, that's always dangerous.)

True, and fair point. It's not *wholly* uncharted -- C and C++ both have
a variety of alias-analysis-driven *optimizations* -- but the new part
is making one of those analysis passes airtight enough to consider as
safety guarantees.

*) Here I mean types defined like 'type foo = obj { ... }' rather than
'obj foo() { ... }'. I saw someone claiming the former syntax was
invalid in the IRC logs this week (it's not), so maybe the distinction
is not widely understood.

The latter implies an in-place definition of the former as well. But
yes, both are valid.


-Graydon
