On Thursday, 26 February 2015 at 17:56:14 UTC, Zach the Mystic wrote:
On Wednesday, 25 February 2015 at 21:26:33 UTC, Marc Schütz wrote:
IIRC H.S. Teoh suggested a change to the compilation model. I think he wants to expand the minimal compilation unit to a library or executable. In that case, inference for all kinds of attributes will be available in many more circumstances; explicit annotation would only be necessary for exported symbols.

You probably mean Dicebot:

http://forum.dlang.org/post/[email protected]

You're right! And I just (again wrongly) implicated Martin Nowak in this, too :-P


Anyway, it is a good idea to enable scope semantics implicitly for all references involved in @safe code. As far as I understand it, this is something you suggest, right? It will eliminate annotations except in cases where a parameter is returned, which - as you note - will probably be acceptable, because it's already been suggested in DIP25.

Actually you could eliminate `return` parameters as well, I think. If the compiler has the body of a function, which it usually does, then there shouldn't be a need to mark *any* of the covariant function or parameter attributes. I think it's the kind of thing which should "Just Work" in all these cases.

Agreed. I had the export/import case in mind, where you don't have the function body. The signature then needs to contain `return` parameters, although `scope` would be implied by `@safe`.
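To make the export/import case a bit more concrete, here is a minimal sketch of what such a signature could look like, assuming a compiler that implements DIP25's `return ref`. The name `firstOf` and the variables are made up for illustration. When only the signature is visible, the `return` marker is what tells the caller that the result may alias `a`.

```d
// Hypothetical example: `return` on `a` states that the returned
// reference may refer to `a`; `b` carries no such marker.
ref int firstOf(return ref int a, ref int b) @safe {
    return a;
}

void main() @safe {
    int x = 1, y = 2;
    firstOf(x, y) = 10; // writes through the returned reference
    assert(x == 10);
    assert(y == 2);
}
```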

I also think it is too coarse. Even variables declared at the same lexical scope have different lifetimes, because they are destroyed in reverse order of declaration. This is relevant if they contain references and have destructors that access the references; we need to make sure that no reference to a destroyed variable can be kept in a variable whose destructor hasn't yet run.

It might be too coarse. We could reserve a few more bits for depth-constant declaration order. At the same time, it doesn't seem *that* urgent to me. But maybe I'm naive about this. Everything is being destroyed anyway, so what's the real danger?

struct A {
    B* b;
    ~this() {
        b.doSomething();
    }
}

struct B {
    void doSomething() {}
}

void foo() {
    A a;      // declscope(1)
    B b;      // declscope(1)
    a.b = &b; // refscope(1) <= declscope(1): OK
    // end of scope:
    // `b` is destroyed
    // `a`'s destructor is called
    // => you're calling a method on a destroyed object
}

Basically, every variable needs to get its own declscope; all declscopes form a strict hierarchy (no partial overlaps).
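The destruction order this relies on can be checked with a small self-contained program. This is only a sketch: `Tracker`, `destroyed`, and `scopeBody` are names invented here, and the global array exists solely to record the order in which the destructors run.

```d
import std.stdio : writeln;

string[] destroyed; // hypothetical log of destructor calls

struct Tracker {
    string name;
    ~this() { destroyed ~= name; }
}

void scopeBody() {
    auto first = Tracker("a");  // declared first, destroyed last
    auto second = Tracker("b"); // declared last, destroyed first
}

void main() {
    scopeBody();
    // Locals are destroyed in reverse order of declaration.
    assert(destroyed == ["b", "a"]);
    writeln(destroyed);
}
```

So a reference from `first` to `second` is still reachable from `first`'s destructor after `second` is gone, even though both live in the same lexical scope.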


Principle 5: It's always un@safe to copy a declaration scope from a higher scopedepth to a reference variable stored at lower scopedepth. DIP69 tries to banish this type of thing only in `scope` variables, but I'm not afraid to banish it in all @safe code period:

For backwards compatibility reasons, it might be better to restrict it to `scope` variables. But as all references in @safe code should be implicitly `scope`, this would mostly have the same effect.

I guess this is the "Language versus Legacy" issue. I think D's strength is in its language, not its huge legacy codebase. Therefore, I find myself going with the #pleasebreakourcode crowd, for the sake of extending D's lead where it shines.

Me too, actually, but it would be a really hard sell.

I'm not sure all references in safe code need to be `scope` - that would break a lot of code in itself, right?

Not sure how much would be affected. I actually suspect that most of it already behaves as if it were scope, with the exception of newly allocated memory. But those should ideally be "owned" instead.

But you're right, there still needs to be an opt-out possibility, most likely `static`.

Principle 10: You'll probably have noticed that all scopes accumulate each other according to lexical ordering, and that's good news, because any sane person assigns and returns references in lexical order.

As you say, that's broken. But why does it need to be in lexical order in the first place? I would simply analyze the entire function first, assign reference scopes, and disallow circular relations (like `a = b; b = a;`).

T* fun(T* a, T** b) {
  T* c = new T;
  c = a;
  *b = c;
  return c;
}

Algorithm for inference of ref scopes (= parameter annotations):

1) Each variable, parameter, and the return value get a ref scope (or ref depth). A ref scope can either be another variable (including `return` and `this`) or `static`.

2) The initial ref scope of variables is themselves.

3) Each time a variable (or something reachable through a variable) is assigned (returning counts as assignment to the return value), i.e. for each location in the function where an assignment happens, the new ref scope will be:

3a) the scope of the source, if it is larger or equal to the old scope

3b) otherwise (for disjoint scopes, or assignment from smaller to larger scope), it is an error (could potentially violate guarantees)

4) If a source scope refers to a variable (apart from the destination itself), for which not all assignments have been processed yet, it is put into a queue, to be evaluated later. For code like `a = b; b = a;` there can be dependency cycles. Such code will be disallowed.

How exactly the scope of a complex expression has to be computed is left open here.

In the end, if there was no error, all variables, parameters and the return value will have a minimum reference scope assigned. If that scope is the variable itself, they can be inferred as `scope`. If it is a parameter, that parameter gets an `out!identifier` or `return` annotation.

Note that the order in which the "assignments" occur inside the function doesn't matter. This is more restrictive than strictly necessary, but it's certainly ok in most cases, easy to work around when not, and it doesn't require data/control flow analysis.

(By the way: inference cannot work for recursive functions.)

Your example:

T* fun(T* a, T** b) {
    // => S(a) = a
    // => S(b) = b
    // => S(return) = <doesn't matter>
    T* c; // == (T*).init == null
    // => S(c) = c
    c = new T;
    // `new` returns static, which is wider than c
    // => S(c) = static
    c = a;
    // => invalid, narrowing not allowed
    // (this is what I asked about, and now I
    // see why it's necessary)
    // let's assume it didn't happen, so that
    // the next two statements work
    *b = c;
    // => S(b) = S(c) = static
    return c;
    // => S(return) = S(c) = static
}
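The trace above can be reproduced with a toy model. This is a rough sketch under several assumptions, not how a compiler would do it: it only covers rules 2, 3a and 3b, processes assignments in source order (so the queue from step 4 is left out), treats `*b = c` as an assignment to `b`, models the return value as a local named "return", and assumes that `static` is wider than parameters, which are wider than locals. All type and function names (`RefScope`, `widerOrEqual`, `infer`, ...) are invented for the example.

```d
import std.stdio : writeln;

// Hypothetical model, not compiler code: a ref scope is `static`,
// a parameter, or a local variable.
enum Kind { local, param, isStatic }

struct RefScope {
    Kind kind;
    string name; // variable the scope refers to; empty for static
}

// Assumed ordering: static outlives everything, parameters outlive locals,
// and two different variables of the same kind count as disjoint.
bool widerOrEqual(RefScope src, RefScope dst) {
    if (src == dst) return true;
    if (src.kind == Kind.isStatic) return true;
    return src.kind == Kind.param && dst.kind == Kind.local;
}

// `src == "new"` stands for a heap allocation, which yields static.
struct Assignment { string dest; string src; }

// Applies rules 3a/3b in source order. Returns null on success,
// otherwise a message naming the first rejected assignment.
string infer(ref RefScope[string] scopes, Assignment[] assignments) {
    foreach (a; assignments) {
        auto source = a.src == "new" ? RefScope(Kind.isStatic, "") : scopes[a.src];
        if (!widerOrEqual(source, scopes[a.dest]))
            return "narrowing not allowed: " ~ a.dest ~ " = " ~ a.src;
        scopes[a.dest] = source;
    }
    return null;
}

// Rule 2: every variable starts out with itself as its ref scope.
RefScope[string] initialScopes() {
    return [
        "a": RefScope(Kind.param, "a"),
        "b": RefScope(Kind.param, "b"),
        "c": RefScope(Kind.local, "c"),
        "return": RefScope(Kind.local, "return")
    ];
}

void main() {
    auto s1 = initialScopes();
    // c = new T; c = a;  => the second assignment narrows static to a
    assert(infer(s1, [Assignment("c", "new"), Assignment("c", "a")])
           == "narrowing not allowed: c = a");

    auto s2 = initialScopes();
    // c = new T; *b = c; return c;  (with `c = a` left out)
    assert(infer(s2, [Assignment("c", "new"), Assignment("b", "c"),
                      Assignment("return", "c")]) is null);
    assert(s2["b"].kind == Kind.isStatic);
    assert(s2["return"].kind == Kind.isStatic);
    writeln("ok");
}
```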

This algorithm can also be modified slightly to allow only partial inference (only of some variables, e.g. locals, when the parameters have already been explicitly annotated), as well as for checking whether the assignments are valid in this case.

I'm a bit tired now, so maybe this contains glaring mistakes, but if so, I hope they can be fixed :-) I hope it's clear what I'm trying to do here.

Something else that needs consideration: What happens when parameters alias each other? I think it is ok, because the checking phase will naturally prohibit calling functions in a way that would break the guarantees, but I haven't thought it through completely.

It's not so simple at all. For full-blown unique ownership, there needs to be some kind of borrow-checking like in Rust. I have some ideas how a simple borrow-checker can be implemented without much work (without data flow analysis as Rust does). It's basically my "const borrowing" idea (whose one flaw incidentally cannot be triggered by unique types, because it is conditioned on the presence of aliasing).

There are still some things in the proposal that I'm sure can be simplified. We probably don't need new keywords like `noscope`. I'm not even sure the concept itself is needed.

Unless you want to flat out ban copying a parameter reference to a global in @safe code, you will need `noscope`, or, as you suggested, `static`.

You're right, it's necessary.

I'm actually thinking of reusing `noscope` as a function attribute (`@noscope` perhaps) which says that the function may return a heap or global reference. This is all that's necessary to complete an ownership system. If a scope has exactly 1 "mystery" bit set, and is known not to come from the heap or a global, then you know that it *must* contain a reference to exactly the parameter for which the mystery bit is set. You know exactly what it contains == ownership.

I will have to think about this, but I believe you cannot express such concepts as deadalnix's islands, or "const borrowing". But maybe, if we're lucky, I'm wrong :-)
