Re: Smart pointers instead of GC?

Adam D. Ruppe Tue, 04 Feb 2014 18:39:04 -0800

On Tuesday, 4 February 2014 at 22:30:39 UTC, Walter Bright wrote:

> I wonder how Rust deals with this.

The only time ownership matters is if you are going to store the pointer. It is like the difference between a container and a range.

An algorithm doesn't need to know about the specifics of a container. Let's use average for example. We might write it in D:


int average(InputRange)(InputRange r) {
    // assumes r is not empty; an empty range would divide by zero below
    int count = 0;
    int sum = 0;
    while(!r.empty) {
         count++;
         sum += r.front;
         r.popFront();
    }
    return sum / count;
}

Now, this being a template, D will generate new code for a variety of types... but even if we replaced InputRange with a specific thing, let's call it int[], it is still usable by a variety of containers:


int average(int[] r) { /* same impl */ }


D has two containers built in that provide this range:

int[50] staticArray;
int[] dynamicArray = new int[](50);

average(staticArray[]); // works
average(dynamicArray); // works

Pointers also offer this:

int* pointer = cast(int*) malloc(50 * int.sizeof);
average(pointer[0 .. 50]);



Moreover, user-defined types can also provide this range:

struct Numbers {
    int[] opSlice() { return [1,2,3]; }
}

Numbers numbers;
average(numbers[]); // works

In theory, we could provide either an inputRangeObject or a slice for linked lists, lazy generators, anything. One function, any kind of input.

Of course, we could slice memory from any allocator. Heck, we saw three different allocations right here (with three different types! stack, GC, and malloc) all using the same function, without templating.

I'm sure none of this is new to you... and this is basically how the Rust thing works too. Our uses of int[] (or the input range) are borrowed pointers. Algorithms are written in their terms.
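To put the fragments above into one program, here is a minimal sketch, assuming the non-template average(int[]) and tiny made-up inputs (the element values and the null check are mine, not part of the original snippets). It runs the same borrowing function over stack, GC, and malloc memory:

```d
import core.stdc.stdlib : free, malloc;
import std.range.primitives : empty, front, popFront;

// Borrowing function: reads through the slice and never stores it.
int average(int[] r) {
    assert(!r.empty, "average needs at least one element");
    int count = 0;
    int sum = 0;
    while (!r.empty) {
        count++;
        sum += r.front;
        r.popFront();   // only advances our local copy of the slice
    }
    return sum / count;
}

void main() {
    int[4] staticArray = [2, 4, 6, 8];                 // stack
    int[] dynamicArray = [10, 20, 30];                 // GC heap
    int* pointer = cast(int*) malloc(2 * int.sizeof);  // C heap
    assert(pointer !is null, "malloc failed");
    scope (exit) free(pointer);                        // we own it, we free it
    pointer[0] = 7;
    pointer[1] = 9;

    assert(average(staticArray[]) == 5);
    assert(average(dynamicArray) == 20);
    assert(average(pointer[0 .. 2]) == 8);
}
```

None of the three callers had to tell average who owns the memory; each one only lends a slice for the duration of the call.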

The ownership type only matters when you store it. And it turns out this matters in D as well:


struct ManualArray(T) {
    size_t length;
    T* data;

    // malloc returns void*, so it needs an explicit cast to T*
    this(size_t len) { data = cast(T*) malloc(T.sizeof * len); length = len; }

    ~this() { free(data); }
    T[] opSlice() { return data[0 .. length]; }
    @disable this(this); // copying this is wrong, don't allow it!
}

void main() {
    auto array = ManualArray!int(50);
    average(array[]); // works, reusing our pointer
}


But borrowing comes into play if we store it:

int[] globalArray;
void foo(int[] array) {
    globalArray = array;
}

void bar() {
    auto array = ManualArray!int(50);
    foo(array[]); // uh oh
}

void main() {
   bar();
   globalArray[0] = 10; // crash likely, memory safety violated
}

Again, I'm sure none of this is new to you, but it illustrates owned vs borrowed: ManualArray is owned. Storing it is safe - it ensures its internal pointer is valid throughout its entire lifetime.

But ManualArray.opSlice returns a borrowed reference. Great for algorithms or any processing that doesn't escape the reference. Anything that would be written in terms of an input range is probably correct with this.

However, we stored the borrowed reference, which is a no-no. array went out of scope, freeing the memory, leaving the escaped borrowed reference in an invalid state.

Let's say we did want to store it. There are a few options: we could make our own copy or store the pre-made copy.


GC!(int[]) globalArray;
void foo(GC!(int[]) array) { globalArray = array; }

That's sane: the GC owns it and we specified that, so storing it is cool.

We could also take a RefCounted!(int[]), if that's how we wanted to store it.

But let's say we wanted to store it with a different method. There are only two sane options:



void foo(int[] array) { globalArray = array.dup; }

Take a borrowed reference and make a copy of it. The function foo is in charge of allocating (here, we made a GC managed copy).



OR, don't implement that and force the user to decide:


void foo(GC!(int[]) array) {...}


user:

foo(ownedArray[]); // error, cannot implicitly convert int[] to GC!(int[])

int[50] stackArray;

foo(stackArray[]); // error, cannot implicitly convert int[] to GC!(int[])

Now, the user makes the decision. It is going to be stored, and the function signature says that up front by asking for a non-borrowed reference. They won't get a surprise crash when the globalArray later accesses stack or freed data. They have to deal with the error. They might not call the function, or they might do the .dup themselves. Either way, memory safety is preserved and inefficiencies are visible.
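GC!(T) is not an existing D type, so here is one way it could be sketched, purely as an assumption: a thin wrapper struct that can only be obtained from a borrowed slice by copying. The names GC, gcCopy, and payload are hypothetical, and D's lack of implicit struct construction for function arguments is what produces the compile error at the call site:

```d
// Hypothetical marker type meaning "this data is owned by the GC".
struct GC(T) {
    private T payload;
    // Lend the contents out again as a plain borrowed value.
    T opSlice() { return payload; }
}

// Hypothetical helper: the visible, explicit copy into GC memory.
GC!(int[]) gcCopy(const(int)[] borrowed) {
    return GC!(int[])(borrowed.dup);
}

GC!(int[]) globalArray;

// The signature says up front that the argument will be stored.
void foo(GC!(int[]) array) { globalArray = array; }

void main() {
    int[3] stackArray = [1, 2, 3];

    // A borrowed slice is rejected at compile time.
    static assert(!__traits(compiles, foo(stackArray[])));

    // The caller decides to pay for a copy, in plain sight.
    foo(gcCopy(stackArray[]));

    stackArray[0] = 99;                   // does not touch the stored copy
    assert(globalArray[] == [1, 2, 3]);
}
```

The private field only protects against other modules, so this is a convention with some compiler help rather than an airtight guarantee.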

So, a function that stores a reference would only ever come in one or two signatures, regardless of how many allocation strategies exist:

1) The exact match for the callee's allocation strategy. The callee, knowing what the strategy is, can also be sanely responsible for freeing it. (A struct dtor, for example, knows that its members are malloced and can thus call free.)

2) A generic borrowed type, e.g. input range or slice, which it then copies privately. Since these are arguably hidden allocations, you might not even like these. Calling .dup (or whatever) at the call site keeps the allocations visible.

So bottom line, you don't duplicate functions for the different types. You borrow references for processing (analogous to implementing algorithms with ranges) and own references for storing... which you need to know about, so only one type makes sense.
