@Araq, thanks for the acknowledgment and the tracking link; I assumed that you
read even when you don't reply, as I see changes here and there that I have suggested:
> (Work has begun to make closures work with the new runtime.)
Lately, most of my thinking has been about how closures and other heap-based
data types should be treated in the new runtime, and it has made me think
that the current spec 2 has a major problem, as follows:
1. In the current runtime, a closure can be considered (or is) a tuple, as in
the following:

```nim
type Closure[Args, T] = object # or tuple
  prc: proc(hiddnEnvTpl: ptr tuple[...]; args: nrmlArgs): T {.nimcall.}
  env: ptr tuple[...]
```
2. I heard you mention that you thought the new runtime could implement the
Closure type just by changing the `ptr` to an owned `ref`, meaning that the
Closure owns its own environment, which is destroyed when the closure
goes out of scope. This assumes use of an object instead of a tuple, or giving
tuples the ability to have the same custom "hooks" as objects (which I highly
recommend for consistency anyway, as tuples and objects are just two slightly
different forms of similar containers).
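
For concreteness, that suggestion might look like the following under the spec's proposed `owned` syntax (hypothetical code that does not compile on the current runtime; `Env` is my stand-in for the elided hidden environment tuple):

```nim
# Hypothetical sketch only: the closure owns its environment via an
# `owned ref` instead of a raw `ptr`. `Env` is illustrative.
type
  Env = tuple[captured: int]
  Closure[Args, T] = object   # or tuple, if tuples gain hooks
    prc: proc(env: ref Env; args: Args): T {.nimcall.}
    env: owned ref Env        # destroyed when the closure goes out of scope
```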
3. However, I think that would lead to some problems. Objects/tuples don't
have a concept of ownership; only the new refs do. So as the closure gets
passed around and used, it becomes unclear when it will be destroyed and whether
its lifetime exceeds its use in many edge cases. Even if the compiler can
do static flow analysis to cover that, another problem is that when the implied
ownership changes, the implied type of all the contained refs needs to
change as well, basically meaning that a new data type is implicitly created based
on copy/move. I'm sure there are ways to work around this, but they all seem
quite complex (and therefore likely wrong ;-) ).
4. The solution I propose is quite simple (and therefore hopefully correct).
The rule is that everything that needs a concept of ownership, which is
everything that is allocated on the heap or contains anything allocated on the
heap, must be "contained" by a ref, which may be either owned or unowned
according to the new copy/move semantics. The new Closure would now be defined
as follows:
```nim
type Closure[Args, T] = ref object # or tuple...
  prc: proc(hiddnEnvTpl: ptr tuple[...]; args: nrmlArgs): T {.nimcall.}
  env: ptr tuple[...]
```
Note that the environment could be referenced by an owned ref, but there is little
point, as using just a `ptr` is more flexible and it will never escape the owning
object/tuple. This new ref object Closure obeys all of the new
destroy/copy/move semantics and can be passed around freely with respect to
those. If there is an edge case where the use exceeds the lifetime, we can
simply do a `deepCopy` of the Closure (I have ideas on implementing that
generally for all types, too), including allocating a fresh `ptr` for the copy of
the environment, and end up with as many copies as required to get the job
done, with all of the owned ones destroyed when they go out of scope in each of
their uses.
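
To make this concrete, here is a hedged sketch of what such a `deepCopy` could look like for the ref-object Closure. `Env` stands in for the elided hidden environment tuple, and all names are illustrative, not part of any existing API:

```nim
# Sketch only: `Env` is a stand-in for the closure's hidden
# environment tuple; all names are illustrative.
type
  Env = tuple[captured: int]
  Closure = ref object
    prc: proc(env: ptr Env; x: int): int {.nimcall.}
    env: ptr Env

proc deepCopy(c: Closure): Closure =
  # The copy gets its own freshly allocated environment, so its
  # lifetime is fully independent of the original's.
  result = Closure(prc: c.prc)
  if c.env != nil:
    result.env = create(Env)   # fresh heap cell for the copy
    result.env[] = c.env[]     # copy the captured values
```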
5. Proof that this will work: we have only one owner, which means the closure
will be destroyed only once, no matter how many unowned references there are.
Only `=destroy` does a deep destroy that chases the `ptr`; `=` (copy, in the
case of an unowned ref) and `=move` do only shallow copies/swaps, so they just
transfer the `ptr` over. Ending up with more than one copy whose field has the
same `ptr` target isn't a problem, because unowned refs never destroy.
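
A hedged sketch of the hook behaviour described in this proof, written against today's hook signatures (the owned/unowned distinction itself is hypothetical, and `Env` again stands in for the hidden environment tuple):

```nim
type
  Env = tuple[captured: int]
  ClosureObj = object   # the value behind the proposed `ref object`
    prc: proc(env: ptr Env; x: int): int {.nimcall.}
    env: ptr Env

proc `=destroy`(c: var ClosureObj) =
  # Deep destroy: only the single owner chases and frees the ptr.
  if c.env != nil:
    dealloc(c.env)
    c.env = nil

proc `=sink`(dst: var ClosureObj; src: ClosureObj) =
  # Move: shallow, just transfer the ptr; ownership moves with it,
  # so no double free is possible.
  `=destroy`(dst)
  dst.prc = src.prc
  dst.env = src.env
```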
6. An adjunct benefit of the scheme: it overcomes one of the primary limitations
expressed in spec 2, namely that a given field in two instances may have a
pointer to the same memory area. With that field inside a ref, as
explained in the proof, this now isn't a problem, as it is protected by the
ownership of the ref.
7. This "simple scheme" would be consistently applied to all data types
containing heap data, such as strings and seqs and anything else in the libraries
- they will all have this same benefit. This makes the "Motivating example" in
spec 2 null and void, and I will shortly produce a new motivating example
according to this "simple rule". Unfortunately, it will mean that all of the data
types which are implemented as objects containing pointers or ref pointers will
have to be changed to follow the ref object "simple plan", but the benefits are
manifold...
8. They will also enjoy a further benefit which will boost performance in a
large number of cases: copying of these data types can now almost always be
shallow (by default), since they are reference values, as enabled by
the copy/move semantics of the owned/unowned ref.
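
A brief hedged sketch of that shallow-by-default copy, using a ref-wrapped string as proposed in point 7 (illustrative type, not the stdlib's):

```nim
# Illustrative only: a string wrapped in a ref per the "simple scheme".
type RcString = ref object
  len: int
  data: ptr UncheckedArray[char]

proc assignShallow(dst: var RcString; src: RcString) =
  # Under the proposal this produces an unowned reference: no
  # character data is duplicated, only the reference itself is copied.
  dst = src
```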
9. In the interests of simplicity I will make one more suggestion: that we
don't have a keyword `owned`, as all refs are created as owned and in
most cases implicitly converted to unowned when copied, with implicit casting,
meaning that we would only rarely have to use the alternate keyword `unowned` as
a means of forcing the copy (it can also be forced by using a proc such as
`shallowCopy`, which would do this for any sort of ref as argument, converting
either an owned or unowned ref to an unowned ref).
I've implemented enough of this in a couple of hundred lines of code, including
some tests, to indicate it works like a charm; given that the underpinnings
are in the standard libraries and the compiler, the test code looks at least as
concise as it would currently be, but so much more capable and faster.
10. A final wish, just to complete the ease of use: it would be nice to be able
to do the following, with an automatically generated "constructor":

```nim
type
  Foo = ref object
    cntnts: int

let f = Foo(cntnts: 42)
```
instead of the following:

```nim
type
  Foo = ref object
    cntnts: int

proc newFoo(c: int): auto =
  result = Foo()
  result.cntnts = c

let f = newFoo(42)
```
with the automatically generated constructor able to accept as many named
parameters as we want to be non-default, exactly as we can do already with
objects/tuples, except that the constructor wraps the whole thing in a ref. This
would save some boilerplate code...
Nim is already a great language, but with too many edge cases, as in "when
strings/seqs are included in a ptr structure, be sure to call GC_unref on them
or else you'll have a memory leak", the protect/dispose handling, all the
different ways to select a GC and change its behaviour/performance, the
limitations of the GC not working across threads, etc. With a working memory
management scheme as described, it becomes a "complete" and unique language
without many (any?) edge cases, and with the entire range of "value" versus
"reference" bindings able to be applied.
I'm likely newer to Nim than many of you on this forum, but perhaps that's a
good thing in that you get the advantage of a fresh perspective. I assure you
that I am sincere in wanting to try to make Nim the best language it can be. I
think in many ways it has the potential to be the best imperative language with
some functional capabilities - period. This new memory management paradigm
could be **the** prime distinguishing feature, as it could be both fast and
easy to use compared to the systems used by the "biggies" of Rust, Swift,
and C/C++, and its syntax (to me) is cleaner than all of theirs.