@Araq, thanks for the acknowledgment and the tracking link; I assumed that you
read even when you don't reply, as I see changes here and there that I have suggested:
> (Work has begun to make closures work with the new runtime.)
Lately, most of my thinking has been about how closures and other heap-based
data types should be treated in the new runtime, and it has made me think
that the current spec 2 has a major problem, as follows:
1. In the current runtime, a closure can be considered (or is) a tuple, as in
the following:

```nim
type Closure[Args, T] = object # or tuple
  prc: proc(hiddnEnvTpl: ptr tuple[...]; args: nrmlArgs): T {.nimcall.}
  env: ptr tuple[...]
```
2. I heard you mention that you thought the new runtime could implement the
Closure type just by changing the `ptr` to an owned `ref`, meaning that the
Closure owns its own environment, which is destroyed when the closure
goes out of scope. This assumes use of an object instead of a tuple, or giving
tuples the ability to have the same custom "hooks" as objects (which I highly
recommend for consistency anyway, as tuples and objects are just two slightly
different forms of similar containers).
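
For concreteness, that suggestion might look like the following under the spec's proposed `owned` syntax (hypothetical code that does not compile on the current runtime; `Env` is my stand-in for the elided hidden environment tuple):

```nim
# Hypothetical sketch only: the closure owns its environment via an
# `owned ref` instead of a raw `ptr`. `Env` is illustrative.
type
  Env = tuple[captured: int]
  Closure[Args, T] = object   # or tuple, if tuples gain hooks
    prc: proc(env: ref Env; args: Args): T {.nimcall.}
    env: owned ref Env        # destroyed when the closure goes out of scope
```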
3. However, I think that would lead to some problems. Objects/tuples don't
have a concept of ownership; only the new refs do. So as the closure gets
passed around and used, it becomes unclear when it will be destroyed and whether
its lifetime exceeds its use in many edge cases. Even if the compiler can
do static flow analysis to cover that, another problem is that when the implied
ownership changes, the implied type of all the contained refs needs to
change as well, basically meaning that a new data type is implicitly created based
on copy/move. I'm sure there are ways to work around this, but they all seem
quite complex (and therefore likely wrong ;-) ).
4. The solution I propose is quite simple (and therefore hopefully correct).
The rule is that everything that needs a concept of ownership, which is
everything that is allocated on the heap or contains anything allocated on the
heap, must be "contained" by a ref, which may be either owned or unowned
according to the new copy/move semantics. The new Closure would now be defined
as follows:
```nim
type Closure[Args, T] = ref object # or tuple...
  prc: proc(hiddnEnvTpl: ptr tuple[...]; args: nrmlArgs): T {.nimcall.}
  env: ptr tuple[...]
```
Note that the environment could be referenced by an owned ref, but there is little
point, as using just a `ptr` is more flexible and it will never escape the owning
object/tuple. This new ref object Closure obeys all of the new
destroy/copy/move semantics and can be passed around freely with respect to
those. If there is an edge case where the use exceeds the lifetime, we can
simply do a `deepCopy` of the Closure (I have ideas on implementing that
generally for all types, too), including allocating a fresh `ptr` for the copy of
the environment, and end up with as many copies as required to get the job
done, with all of the owned ones destroyed when they go out of scope in each of
their uses.
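
To make this concrete, here is a hedged sketch of what such a `deepCopy` could look like for the ref-object Closure. `Env` stands in for the elided hidden environment tuple, and all names are illustrative, not part of any existing API:

```nim
# Sketch only: `Env` is a stand-in for the closure's hidden
# environment tuple; all names are illustrative.
type
  Env = tuple[captured: int]
  Closure = ref object
    prc: proc(env: ptr Env; x: int): int {.nimcall.}
    env: ptr Env

proc deepCopy(c: Closure): Closure =
  # The copy gets its own freshly allocated environment, so its
  # lifetime is fully independent of the original's.
  result = Closure(prc: c.prc)
  if c.env != nil:
    result.env = create(Env)   # fresh heap cell for the copy
    result.env[] = c.env[]     # copy the captured values
```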
5. Proof that this will work: we have only one owner, which means the closure
will be destroyed only once, no matter how many unowned references there are.
Only `=destroy` does a deep destroy that chases the `ptr`; `=` (copy, in the
case of an unowned ref) and `=move` do only shallow copies/swaps, so they just
transfer the `ptr` over. Ending up with more than one copy whose field has the
same `ptr` target isn't a problem, because unowned refs never destroy.
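
A hedged sketch of the hook behaviour described in this proof, written against today's hook signatures (the owned/unowned distinction itself is hypothetical, and `Env` again stands in for the hidden environment tuple):

```nim
type
  Env = tuple[captured: int]
  ClosureObj = object   # the value behind the proposed `ref object`
    prc: proc(env: ptr Env; x: int): int {.nimcall.}
    env: ptr Env

proc `=destroy`(c: var ClosureObj) =
  # Deep destroy: only the single owner chases and frees the ptr.
  if c.env != nil:
    dealloc(c.env)
    c.env = nil

proc `=sink`(dst: var ClosureObj; src: ClosureObj) =
  # Move: shallow, just transfer the ptr; ownership moves with it,
  # so no double free is possible.
  `=destroy`(dst)
  dst.prc = src.prc
  dst.env = src.env
```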
6. An adjunct benefit of the scheme: it overcomes one of the primary limitations
expressed in spec 2, namely that a given field in two instances may have a
pointer to the same memory area. With that field inside a ref, as
explained in the proof, this now isn't a problem, as it is protected by the
ownership of the ref.
7. This "simple scheme" would be consistently applied to all data types
containing heap data, such as strings and seqs and anything else in the libraries
- they will all have this same benefit. This makes the "Motivating example" in
spec 2 null and void, and I will shortly produce a new motivating example
according to this "simple rule". Unfortunately, it will mean that all of the data
types which are implemented as objects containing pointers or ref pointers will
have to be changed to follow the ref object "simple plan", but the benefits are
manifold...
8. They will also enjoy a further benefit which will boost performance in a
large number of cases: copying of these data types can now almost always be
shallow (by default), since they are reference values, as enabled by
the copy/move semantics of the owned/unowned ref.
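
A brief hedged sketch of that shallow-by-default copy, using a ref-wrapped string as proposed in point 7 (illustrative type, not the stdlib's):

```nim
# Illustrative only: a string wrapped in a ref per the "simple scheme".
type RcString = ref object
  len: int
  data: ptr UncheckedArray[char]

proc assignShallow(dst: var RcString; src: RcString) =
  # Under the proposal this produces an unowned reference: no
  # character data is duplicated, only the reference itself is copied.
  dst = src
```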
9. In the interests of simplicity I will make one more suggestion: that we
don't have a keyword `owned`, as all refs are created as owned and in
most cases implicitly converted to unowned when copied, with implicit casting,
meaning that we would only rarely have to use the alternate keyword `unowned` as
a means of forcing the copy (it can also be forced by using a proc such as
`shallowCopy`, which would do this for any sort of ref as argument, converting
either an owned or unowned ref to an unowned ref).
I've implemented enough of this in a couple of hundred lines of code, including
some tests, to indicate it works like a charm; given that the underpinnings
are in the standard libraries and the compiler, the test code looks at least as
concise as it would currently be, but so much more capable and faster.
10. A final wish, just to complete the ease of use: it would be nice to be able
to do the following, with an automatically generated "constructor":

```nim
type
  Foo = ref object
    cntnts: int

let f = Foo(cntnts: 42)
```
instead of the following:

```nim
type
  Foo = ref object
    cntnts: int

proc newFoo(c: int): auto =
  result = Foo()
  result.cntnts = c

let f = newFoo(42)
```
with the automatically generated constructor able to accept as many named
parameters as we want to be non-default, exactly as we can do already with
objects/tuples, except that the constructor wraps the whole thing in a ref. This
would save some boilerplate code...
Nim is already a great language, but with too many edge cases, as in "when
strings/seqs are included in a ptr structure, be sure to call GC_unref on them
or else you'll have a memory leak", the protect/dispose handling, all the
different ways to select a GC and change its behaviour/performance, the
limitations of the GC not working across threads, etc. With a working memory
management scheme as described, it becomes a "complete" and unique language
without many (any?) edge cases, and with the entire range of "value" versus
"reference" bindings able to be applied.
I'm likely newer to Nim than many of you on this forum, but perhaps that's a
good thing in that you get the advantage of a fresh perspective. I assure you
that I am sincere in wanting to try to make Nim the best language it can be. I
think in many ways it has the potential to be the best imperative language with
some functional capabilities - period. This new memory management paradigm
could be **the** prime distinguishing feature, as it could be both fast and
easy to use compared to the systems used by the "biggies" of Rust, Swift,
and C/C++, and its syntax (to me) is cleaner than all of theirs.