Re: My Reference Safety System (DIP???)

via Digitalmars-d Thu, 26 Feb 2015 13:36:02 -0800

On Thursday, 26 February 2015 at 17:56:14 UTC, Zach the Mystic wrote:

On Wednesday, 25 February 2015 at 21:26:33 UTC, Marc Schütz wrote:
IIRC H.S. Teoh suggested a change to the compilation model. I think he wants to expand the minimal compilation unit to a library or executable. In that case, inference for all kinds of attributes will be available in many more circumstances; explicit annotation would only be necessary for exported symbols.
You probably mean Dicebot:
http://forum.dlang.org/post/[email protected]

You're right! And I just (again wrongly) implicated Martin Nowak in this, too :-P

Anyway, it is a good idea to enable scope semantics implicitly for all references involved in @safe code. As far as I understand it, this is something you suggest, right? It will eliminate annotations except in cases where a parameter is returned, which - as you note - will probably be acceptable, because it's already been suggested in DIP25.
Actually you could eliminate `return` parameters as well, I think. If the compiler has the body of a function, which it usually does, then there shouldn't be a need to mark *any* of the covariant function or parameter attributes. I think it's the kind of thing which should "Just Work" in all these cases.

Agreed. I had the export/import case in mind, where you don't have the function body. The signature then needs to contain `return` parameters, although `scope` would be implied by `@safe`.

I also think it is too coarse. Even variables declared at the same lexical scope have different lifetimes, because they are destroyed in reverse order of declaration. This is relevant if they contain references and have destructors that access the references; we need to make sure that no reference to a destroyed variable can be kept in a variable whose destructor hasn't yet run.
It might be too coarse. We could reserve a few more bits for depth-constant declaration order. At the same time, it doesn't seem *that* urgent to me. But maybe I'm naive about this. Everything is being destroyed anyway, so what's the real danger?


struct A {
    B* b;
    ~this() {
        b.doSomething();
    }
}

struct B {
    void doSomething() {}
}

void foo() {
    A a;      // declscope(1)
    B b;      // declscope(1)
    a.b = &b; // refscope(1) <= declscope(1): OK
    // end of scope:
    // `b` is destroyed
    // `a`'s destructor is called
    // => you're calling a method on a destroyed object
}

Basically, every variable needs to get its own declscope; all declscopes form a strict hierarchy (no partial overlaps).
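
To make the ordering issue concrete, here is a minimal sketch of the same two structs with the declarations swapped, so that `a` is destroyed before `b`. The `alive` flag, the `writeln` call and the `main` wrapper are assumptions added purely for illustration; the only rule relied on is the one stated above, that locals are destroyed in reverse order of declaration.

```d
import std.stdio;

struct B {
    bool alive = true;          // hypothetical flag, only for illustration
    ~this() { alive = false; }
    void doSomething() { writeln("B is alive: ", alive); }
}

struct A {
    B* b;
    ~this() {
        // Only safe if `*b` has not been destroyed yet.
        if (b !is null) b.doSomething();
    }
}

void main() {
    B b;      // declared first  => destroyed last
    A a;      // declared second => destroyed first
    a.b = &b; // `a` cannot outlive `b`, so its destructor sees a live `b`
}
```

With one declscope per variable, this ordering would be accepted while the ordering in `foo` above would be rejected, even though both variables live in the same lexical block.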

Principle 5: It's always un@safe to copy a declaration scope from a higher scopedepth to a reference variable stored at lower scopedepth. DIP69 tries to banish this type of thing only in `scope` variables, but I'm not afraid to banish it in all @safe code period:
For backwards compatibility reasons, it might be better to restrict it to `scope` variables. But as all references in @safe code should be implicitly `scope`, this would mostly have the same effect.
I guess this is the "Language versus Legacy" issue. I think D's strength is in its language, not its huge legacy codebase. Therefore, I find myself going with the #pleasebreakourcode crowd, for the sake of extending D's lead where it shines.


So am I, actually, but it would be a really hard sell.

I'm not sure all references in safe code need to be `scope` - that would break a lot of code in itself, right?

Not sure how much would be affected. I actually suspect that most of it already behaves as if it were scope, with the exception of newly allocated memory. But those should ideally be "owned" instead.

But you're right, there still needs to be an opt-out possibility, most likely static.

Principle 10: You'll probably have noticed that all scopes accumulate each other according to lexical ordering, and that's good news, because any sane person assigns and returns references in lexical order.
As you say, that's broken. But why does it need to be in lexical order in the first place? I would simply analyze the entire function first, assign reference scopes, and disallow circular relations (like `a = b; b = a;`).
T* fun(T* a, T** b) {
  T* c = new T;
  c = a;
  *b = c;
  return c;
}


Algorithm for inference of ref scopes (= parameter annotations):

1) Each variable, parameter, and the return value get a ref scope (or ref depth). A ref scope can either be another variable (including `return` and `this`) or `static`.


2) The initial ref scope of variables is themselves.

3) Each time a variable (or something reachable through a variable) is assigned (returning is assignment to the return value), i.e. for each location in the function where an assignment happens, the new ref scope will be:

3a) the scope of the source, if it is larger than or equal to the old scope

3b) otherwise (for disjoint scopes, or assignment from smaller to larger scope), it is an error (could potentially violate guarantees)

4) If a source scope refers to a variable (apart from the destination itself), for which not all assignments have been processed yet, it is put into a queue, to be evaluated later. For code like `a = b; b = a;` there can be dependency cycles. Such code will be disallowed.

How exactly the scope of a complex expression has to be computed is left open here.

In the end, if there was no error, all variables, parameters and the return value will have a minimum reference scope assigned. If that scope is the variable itself, they can be inferred as `scope`. If it is a parameter, that parameter gets an `out!identifier` or `return` annotation.

Note that the order in which the "assignments" occur inside the function doesn't matter. This is more restrictive than strictly necessary, but it's certainly ok in most cases, easy to work around when not, and it doesn't require data/control flow analysis.


(By the way: inference cannot work for recursive functions.)
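
A rough sketch of steps 1 to 3 as a toy model, under several loudly stated assumptions: all names here (`RefScope`, `Assignment`, `atLeastAsWide`, `infer`, the `rank` table) are hypothetical, a ref scope is reduced to "static or the name of one variable", lifetimes are compared through a hand-assigned rank (parameters 1, locals and `return` 0), two different variables of equal rank are treated as disjoint, and the queue from step 4 is left out, so assignments are simply processed in the order given.

```d
import std.stdio;

// Hypothetical model: a ref scope is `static` or the name of one variable.
struct RefScope {
    bool isStatic;
    string owner; // meaningful only when !isStatic
}

struct Assignment {
    string dst;
    string src;   // a variable name, or "new" for freshly allocated memory
}

// Step 3a: is `source` larger than or equal to `old`?
bool atLeastAsWide(RefScope source, RefScope old, int[string] rank) {
    if (source.isStatic) return true;
    if (old.isStatic) return false;
    if (source.owner == old.owner) return true;
    // Equal rank but different owners counts as disjoint (step 3b).
    return rank[source.owner] > rank[old.owner];
}

// Returns false as soon as one assignment violates step 3b.
bool infer(int[string] rank, Assignment[] assignments,
           ref RefScope[string] scopes) {
    // Step 2: the initial ref scope of every variable is itself.
    foreach (name; rank.byKey)
        scopes[name] = RefScope(false, name);

    foreach (asg; assignments) {
        // Assumption: `new` yields a static ref scope.
        RefScope source = asg.src == "new"
            ? RefScope(true, "") : scopes[asg.src];
        if (!atLeastAsWide(source, scopes[asg.dst], rank)) {
            writeln("error: `", asg.dst, " = ", asg.src, "` narrows the scope");
            return false;
        }
        scopes[asg.dst] = source;
    }
    return true;
}

void main() {
    int[string] rank = ["a": 1, "b": 1, "c": 0, "return": 0];

    // c = new T; *b = c; return c;
    RefScope[string] scopes;
    assert(infer(rank, [Assignment("c", "new"), Assignment("b", "c"),
                        Assignment("return", "c")], scopes));
    assert(scopes["b"].isStatic && scopes["return"].isStatic);

    // c = new T; c = a;  => rejected, narrowing from static to `a`
    RefScope[string] rejected;
    assert(!infer(rank, [Assignment("c", "new"), Assignment("c", "a")],
                  rejected));
}
```

Because the queue is omitted, this toy version depends on statement order, which the algorithm as described does not; it is only meant to show the widening check itself.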

Your example:

T* fun(T* a, T** b) {
    // => S(a) = a
    // => S(b) = b
    // => S(return) = <doesn't matter>
    T* c; // == (T*).init == null
    // => S(c) = c
    c = new T;
    // `new` returns static, which is wider than c
    // => S(c) = static
    c = a;
    // => invalid, narrowing not allowed
    // (this is what I asked about, and now I
    // see why it's necessary)
    // let's assume it didn't happen, so that
    // the next two statements work
    *b = c;
    // => S(b) = S(c) = static
    return c;
    // => S(return) = S(c) = static
}

This algorithm can also be modified slightly to allow only partial inference (only of some variables, e.g. locals, when the parameters have already been explicitly annotated), as well as for checking whether the assignments are valid in this case.

I'm a bit tired now, so maybe this contains glaring mistakes, but if so, I hope they can be fixed :-) I hope it's clear what I'm trying to do here.

Something else that needs consideration: What happens when parameters alias each other? I think it is ok, because the checking phase will naturally prohibit calling functions in a way that would break the guarantees, but I haven't thought it through completely.

It's not so simple at all. For full-blown unique ownership, there needs to be some kind of borrow-checking like in Rust. I have some ideas how a simple borrow-checker can be implemented without much work (without data flow analysis as Rust does). It's basically my "const borrowing" idea (whose one flaw incidentally cannot be triggered by unique types, because it is conditioned on the presence of aliasing).
There are still some things in the proposal that I'm sure can be simplified. We probably don't need new keywords like `noscope`. I'm not even sure the concept itself is needed.
Unless you want to flat out ban copying a parameter reference to a global in @safe code, you will need `noscope`, or, as you suggested, `static`.


You're right, it's necessary.

I'm actually thinking of reusing `noscope` as a function attribute (`@noscope` perhaps) which says that the function may return a heap or global reference. This is all that's necessary to complete an ownership system. If a scope has exactly 1 "mystery" bit set, and is known not to come from the heap or a global, then you know that it *must* contain a reference to exactly the parameter for which the mystery bit is set. You know exactly what it contains == ownership.

I will have to think about this, but I believe you cannot express such concepts as deadalnix's islands, or "const borrowing". But maybe, if we're lucky, I'm wrong :-)
