I haven't yet had time to look at it closely, but here are some initial comments.

On Wednesday, 25 February 2015 at 01:12:15 UTC, Zach the Mystic wrote:
Principle 3: Extra function and parameter attributes are the tradeoff for great memory safety. There is no other way to support both encapsulation of control flow (Principle 2) and the separate-compilation model (indispensable to D). Function signatures pay the price for this with their expanding size. I try to create the new attributes for the rare case, as opposed to the common one, so that they don't appear very often.

IIRC H.S. Teoh suggested a change to the compilation model. I think he wants to expand the minimal compilation unit to a library or executable. In that case, inference for all kinds of attributes will be available in many more circumstances; explicit annotation would only be necessary for exported symbols.

Anyway, it is a good idea to enable scope semantics implicitly for all references involved in @safe code. As far as I understand it, this is something you suggest, right? It will eliminate annotations except in cases where a parameter is returned, which - as you note - will probably be acceptable, because it's already been suggested in DIP25.
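
For illustration, a minimal sketch of what that would look like, using DIP25's `return ref`:

ref int identity(return ref int x) @safe {
  return x; // the parameter escapes via the return value: needs `return`
}

void increment(ref int x) @safe {
  ++x; // implicitly scope: never escapes, so no annotation is needed
}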


Principle 4: Scopes. My system has its own notion of scopes. They are compile time information, used by the compiler to ensure safety. Every declaration which holds data at runtime must have a scope, called its "declaration scope". Every reference type (defined below in Principle 6) will have an additional scope called its "reference scope". A scope consists of a very short bit array, with a minimum of approximately 16 bits and reasonable maximum of 32, let's say. For this proposal I'm using 16, in order to emphasize this system's memory efficiency. 32 bits would not change anything fundamental, only allow the compiler to be a little more precise about what's safe and what's not, which is not a big deal since it conservatively defaults to @system when it doesn't know.

This bitmask seems to be mostly an implementation detail. AFAIU, further below you're introducing some things that make it visible to the user. I'm not convinced this is a good idea; it looks complicated for sure.

I also think it is too coarse. Even variables declared at the same lexical scope have different lifetimes, because they are destroyed in reverse order of declaration. This is relevant if they contain references and have destructors that access the references; we need to make sure that no reference to a destroyed variable can be kept in a variable whose destructor hasn't yet run.
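
For example (a contrived sketch; the `alive` flag only serves to make the dangling access observable):

struct S {
  S* other;
  bool alive = true;
  ~this() {
    alive = false;
    if (other) assert(other.alive); // fails if `other` was destroyed first
  }
}

void fun() {
  S a;          // declared first, destructed *last*
  S b;          // declared second, destructed *first*
  a.other = &b; // by the time a's destructor runs, b is already gone
}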


So what are these bits? Reserve 4 bits for an unsigned integer (range 0-15) I call "scopedepth". Scopedepth is easier for me to think about than lifetime, of which it is simply the inverse, with scopedepth (0) being infinite lifetime, 1 having a lifetime at function scope, etc. Anyway, a declaration's scopedepth is determined according to logic similar to that found in DIP69 and Mark Schutz's proposal:

int r; // declaration scopedepth(0)

void fun(int a /*scopedepth(0)*/) {

(Already pointed out by deadalnix.) Why do parameters have the same depth as globals?

  int b; // scopedepth(1)
  {
    int c; // scopedepth(2)
    {
      int d; // scopedepth(3)
    }
    {
      int e; // scopedepth(3)
    }
  }
  int f; // scopedepth(1)
}

Principle 5: It's always un@safe to copy a declaration scope from a higher scopedepth to a reference variable stored at lower scopedepth. DIP69 tries to banish this type of thing only in `scope` variables, but I'm not afraid to banish it in all @safe code period:

For backwards compatibility reasons, it might be better to restrict it to `scope` variables. But as all references in @safe code should be implicitly `scope`, this would mostly have the same effect.
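
If I understand the rule correctly, something like this would then be rejected in all @safe code:

void fun() @safe {
  int* p;     // reference stored at scopedepth(1)
  {
    int x;    // declaration scopedepth(2)
    p = &x;   // error: depth(2) copied into a reference at depth(1)
  }
  // *p would now dangle
}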

Principle 6: Reference variables: Any data which stores a reference is a "reference variable". That includes any pointer, class instance, array/slice, `ref` parameter, or any struct containing any of those. For the sake of simplicity, I boil _all_ of these down to "T*" in this proposal. All reference types are effectively the _same_ in this regard. DIP25 does not indicate that it has any interest in expanding beyond `ref` parameters. But all reference types are unsafe in exactly the same way as `ref` is. (By the way, see footnote [1] for why I think `ref` is much different from `scope`). I don't understand the restriction of DIP25 to `ref` parameters only. Part of my system is to expand `return` parameters to all reference types.

Fully agree with the necessity to apply it to all kinds of references, of course.
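
For example, something like this should then work uniformly (a sketch, assuming `return` is accepted on non-`ref` parameters as proposed):

int[] tail(return int[] a) @safe {
  return a[1 .. $]; // the resulting slice aliases the parameter
}

T* pick(return T* a, return T* b, bool c) @safe {
  return c ? a : b; // either pointer parameter may escape via the return
}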

Principle 8: Any time a reference is copied, the reference
  ^^^^^^^^^^^
  Principle 7 ?
scope inherits the *maximum* of the two scope depths:

T* gru() {
  static T st; // decl depth(0)
  T t; // decl depth(1)
  T* tp = &t; // ref depth(1)
  tp = &st; // ref depth STILL (1)
  return tp; // error!
}

If you have ever loaded a reference with a local scope, it retains that scope level permanently, ensuring the safety of the reference.

Why is this rule necessary? Can you show an example of what could go wrong without it? I assume it's just there to ease implementation (it avoids the need for data flow analysis)?
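
To spell out what I mean: with data flow analysis, the compiler could accept the following variant of `gru`, because at the point of the `return`, `tp` can only hold `&st`:

T* gru2() {
  static T st; // decl depth(0)
  T t;         // decl depth(1)
  T* tp = &t;  // ref depth(1)
  tp = &st;    // flow analysis could reset the ref depth to (0) here
  return tp;   // actually safe, but rejected by the "sticky maximum" rule
}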

T* fun(T* a, T* b, T** c) {
  // the function's "return scope" accumulates `a` here
  return a;
  T* d = b; // `d`'s reference scope accumulates `b`

  // the return scope now accumulates `b` from `d`
  return d;

  *c = d; // now mutable parameter `c` gets `d`

  static T* t;
  t = b; // this might be safe, but only the caller can know
}

All this accumulation results in the implicit function signature:

T* fun(return T* a, // DIP25
       return noscope T* b, // DIP25 and DIP71
       out!b T** c  // from DIP71
       ) @safe;

I suppose that's about attribute inference?
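
If so, the payoff would be that the caller can be checked against the inferred signature, e.g. (hypothetical `noscope` semantics from DIP71):

void caller() @safe {
  static T g;   // scopedepth(0)
  T local;      // scopedepth(1)
  T* slot;
  fun(&local, &g, &slot);     // ok: only `a` can come back via the return
  fun(&local, &local, &slot); // error: `b` is noscope and may end up in a
                              // global, so a local argument is rejected
}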

Principle 10: You'll probably have noticed that all scopes accumulate each other according to lexical ordering, and that's good news, because any sane person assigns and returns references in lexical order.

As you say, that's broken. But why does it need to be in lexical order in the first place? I would simply analyze the entire function first, assign reference scopes, and disallow circular relations (like `a = b; b = a;`).
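
A loop makes the problem with lexical ordering concrete; a whole-function analysis iterated to a fixed point would catch this, while purely lexical accumulation would not:

T* fun(T* a) {
  T* d, e;
  foreach (i; 0 .. 2) {
    d = e; // on the second iteration, `e` already holds `a`
    e = a;
  }
  return d; // must be treated as returning `a`
}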

Conclusion

1. With this system as foundation, an effective ownership system is easily within reach. Just confine the outgoing scopes to a single parameter and no globals, and you have your ownership. You might need another (rare) function attribute to help with this, and a storage class (e.g. `scope`, `unique`) to give you an error when you do something wrong, but the groundwork is 90% laid.

It's not so simple at all. For full-blown unique ownership, there needs to be some kind of borrow checking like in Rust. I have some ideas about how a simple borrow checker could be implemented without much work (i.e. without the data flow analysis that Rust does). It's basically my "const borrowing" idea (whose one known flaw incidentally cannot be triggered by unique types, because it depends on the presence of aliasing).
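
Roughly, one way the borrowing side might look (a very rough sketch; the names are hypothetical):

struct Unique(T) {
  private T* payload;
  // Hand out only a scoped, const view: `scope` keeps the reference from
  // escaping the callback, and const keeps the borrower from mutating
  // aliased state to smuggle it out.
  void borrow(scope void delegate(scope const(T)*) @safe dg) @safe {
    dg(payload);
  }
}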

There are still some things in the proposal that I'm sure can be simplified. We probably don't need new keywords like `noscope`. I'm not even sure the concept itself is needed.

That all said, I think you're on the right track. The fact that you don't require a new type modifier will make Walter very happy. This looks pretty good!
