I haven't yet had time to look at it closely, but here are some initial comments.

On Wednesday, 25 February 2015 at 01:12:15 UTC, Zach the Mystic wrote:
Principle 3: Extra function and parameter attributes are the tradeoff for great memory safety. There is no other way to support both encapsulation of control flow (Principle 2) and the separate-compilation model (indispensable to D). Function signatures pay the price for this with their expanding size. I try to create the new attributes for the rare case, as opposed to the common one, so that they don't appear very often.

IIRC H.S. Teoh suggested a change to the compilation model. I think he wants to expand the minimal compilation unit to a library or executable. In that case, inference for all kinds of attributes will be available in many more circumstances; explicit annotation would only be necessary for exported symbols.

Anyway, it is a good idea to enable scope semantics implicitly for all references involved in @safe code. As far as I understand it, this is something you suggest, right? It will eliminate annotations except in cases where a parameter is returned, which - as you note - will probably be acceptable, because it's already been suggested in DIP25.
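
For illustration, a minimal sketch of what that would look like, using DIP25's `return ref`:

ref int identity(return ref int x) @safe {
  return x; // the parameter escapes via the return value: needs `return`
}

void increment(ref int x) @safe {
  ++x; // implicitly scope: never escapes, so no annotation is needed
}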


Principle 4: Scopes. My system has its own notion of scopes. They are compile time information, used by the compiler to ensure safety. Every declaration which holds data at runtime must have a scope, called its "declaration scope". Every reference type (defined below in Principle 6) will have an additional scope called its "reference scope". A scope consists of a very short bit array, with a minimum of approximately 16 bits and reasonable maximum of 32, let's say. For this proposal I'm using 16, in order to emphasize this system's memory efficiency. 32 bits would not change anything fundamental, only allow the compiler to be a little more precise about what's safe and what's not, which is not a big deal since it conservatively defaults to @system when it doesn't know.

This bitmask seems to be mostly an implementation detail. AFAIU, further below you're introducing some things that make it visible to the user. I'm not convinced this is a good idea; it looks complicated for sure.

I also think it is too coarse. Even variables declared at the same lexical scope have different lifetimes, because they are destroyed in reverse order of declaration. This is relevant if they contain references and have destructors that access the references; we need to make sure that no reference to a destroyed variable can be kept in a variable whose destructor hasn't yet run.
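
For example (a contrived sketch; the `alive` flag only serves to make the dangling access observable):

struct S {
  S* other;
  bool alive = true;
  ~this() {
    alive = false;
    if (other) assert(other.alive); // fails if `other` was destroyed first
  }
}

void fun() {
  S a;          // declared first, destructed *last*
  S b;          // declared second, destructed *first*
  a.other = &b; // by the time a's destructor runs, b is already gone
}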


So what are these bits? Reserve 4 bits for an unsigned integer (range 0-15) I call "scopedepth". Scopedepth is easier for me to think about than lifetime, of which it is simply the inverse, with scopedepth (0) being infinite lifetime, 1 having a lifetime at function scope, etc. Anyway, a declaration's scopedepth is determined according to logic similar to that found in DIP69 and Mark Schutz's proposal:

int r; // declaration scopedepth(0)

void fun(int a /*scopedepth(0)*/) {

(Already pointed out by deadalnix.) Why do parameters have the same depth as globals?

  int b; // scopedepth(1)
  {
    int c; // scopedepth(2)
    {
      int d; // scopedepth(3)
    }
    {
      int e; // scopedepth(3)
    }
  }
  int f; // scopedepth(1)
}

Principle 5: It's always un@safe to copy a declaration scope from a higher scopedepth to a reference variable stored at lower scopedepth. DIP69 tries to banish this type of thing only in `scope` variables, but I'm not afraid to banish it in all @safe code period:

For backwards compatibility reasons, it might be better to restrict it to `scope` variables. But as all references in @safe code should be implicitly `scope`, this would mostly have the same effect.
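
If I understand the rule correctly, something like this would then be rejected in all @safe code:

void fun() @safe {
  int* p;     // reference stored at scopedepth(1)
  {
    int x;    // declaration scopedepth(2)
    p = &x;   // error: depth(2) copied into a reference at depth(1)
  }
  // *p would now dangle
}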

Principle 6: Reference variables: Any data which stores a reference is a "reference variable". That includes any pointer, class instance, array/slice, `ref` parameter, or any struct containing any of those. For the sake of simplicity, I boil _all_ of these down to "T*" in this proposal. All reference types are effectively the _same_ in this regard. DIP25 does not indicate that it has any interest in expanding beyond `ref` parameters. But all reference types are unsafe in exactly the same way as `ref` is. (By the way, see footnote [1] for why I think `ref` is much different from `scope`). I don't understand the restriction of DIP25 to `ref` parameters only. Part of my system is to expand `return` parameters to all reference types.

Fully agree with the necessity to apply it to all kinds of references, of course.
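
For example, something like this should then work uniformly (a sketch, assuming `return` is accepted on non-`ref` parameters as proposed):

int[] tail(return int[] a) @safe {
  return a[1 .. $]; // the resulting slice aliases the parameter
}

T* pick(return T* a, return T* b, bool c) @safe {
  return c ? a : b; // either pointer parameter may escape via the return
}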

Principle 8: Any time a reference is copied, the reference
  ^^^^^^^^^^^
  Principle 7 ?
scope inherits the *maximum* of the two scope depths:

T* gru() {
  static T st; // decl depth(0)
  T t; // decl depth(1)
  T* tp = &t; // ref depth(1)
  tp = &st; // ref depth STILL (1)
  return tp; // error!
}

If you have ever loaded a reference with a local scope, it retains that scope level permanently, ensuring the safety of the reference.

Why is this rule necessary? Can you show an example of what could go wrong without it? I assume it's just there to ease implementation (it avoids the need for data flow analysis)?
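
To spell out what I mean: with data flow analysis, the compiler could accept the following variant of `gru`, because at the point of the `return`, `tp` can only hold `&st`:

T* gru2() {
  static T st; // decl depth(0)
  T t;         // decl depth(1)
  T* tp = &t;  // ref depth(1)
  tp = &st;    // flow analysis could reset the ref depth to (0) here
  return tp;   // actually safe, but rejected by the "sticky maximum" rule
}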

T* fun(T* a, T* b, T** c) {
  // the function's "return scope" accumulates `a` here
  return a;
  T* d = b; // `d`'s reference scope accumulates `b`

  // the return scope now accumulates `b` from `d`
  return d;

  *c = d; // now mutable parameter `c` gets `d`

  static T* t;
  t = b; // this might be safe, but only the caller can know
}

All this accumulation results in the implicit function signature:

T* fun(return T* a, // DIP25
       return noscope T* b, // DIP25 and DIP71
       out!b T** c  // from DIP71
       ) @safe;

I suppose that's about attribute inference?
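
If so, the payoff would be that the caller can be checked against the inferred signature, e.g. (hypothetical `noscope` semantics from DIP71):

void caller() @safe {
  static T g;   // scopedepth(0)
  T local;      // scopedepth(1)
  T* slot;
  fun(&local, &g, &slot);     // ok: only `a` can come back via the return
  fun(&local, &local, &slot); // error: `b` is noscope and may end up in a
                              // global, so a local argument is rejected
}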

Principle 10: You'll probably have noticed that all scopes accumulate each other according to lexical ordering, and that's good news, because any sane person assigns and returns references in lexical order.

As you say, that's broken. But why does it need to be in lexical order in the first place? I would simply analyze the entire function first, assign reference scopes, and disallow circular relations (like `a = b; b = a;`).
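
A loop makes the problem with lexical ordering concrete; a whole-function analysis iterated to a fixed point would catch this, while purely lexical accumulation would not:

T* fun(T* a) {
  T* d, e;
  foreach (i; 0 .. 2) {
    d = e; // on the second iteration, `e` already holds `a`
    e = a;
  }
  return d; // must be treated as returning `a`
}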

Conclusion

1. With this system as foundation, an effective ownership system is easily within reach. Just confine the outgoing scopes to a single parameter and no globals, and you have your ownership. You might need another (rare) function attribute to help with this, and a storage class (e.g. `scope`, `unique`) to give you an error when you do something wrong, but the groundwork is 90% laid.

It's not so simple at all. For full-blown unique ownership, there needs to be some kind of borrow checking like in Rust. I have some ideas about how a simple borrow checker could be implemented without much work (i.e. without the data flow analysis that Rust does). It's basically my "const borrowing" idea (whose one known flaw incidentally cannot be triggered by unique types, because it depends on the presence of aliasing).
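
Roughly, one way the borrowing side might look (a very rough sketch; the names are hypothetical):

struct Unique(T) {
  private T* payload;
  // Hand out only a scoped, const view: `scope` keeps the reference from
  // escaping the callback, and const keeps the borrower from mutating
  // aliased state to smuggle it out.
  void borrow(scope void delegate(scope const(T)*) @safe dg) @safe {
    dg(payload);
  }
}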

There are still some things in the proposal that I'm sure can be simplified. We probably don't need new keywords like `noscope`. I'm not even sure the concept itself is needed.

That all said, I think you're on the right track. The fact that you don't require a new type modifier will make Walter very happy. This looks pretty good!
