On Sat, Jul 27, 2013 at 11:47 PM, David Jeske <[email protected]> wrote:

> On Sat, Jul 27, 2013 at 9:59 AM, Jonathan S. Shapiro <[email protected]>wrote:
>
>> But more importantly, shifting most of this burden to the compiler would
>> make an even bigger difference, and in most cases the compiler is where it
>> belongs.
>>
>
> Certainly non-escaping argument status can be inferred in many cases.
> However, how does a compiler automatically make a choice about whether a
> library allocates something for return on the GC-heap or
> stack+borrowed-lifetime-pointer?
>

I don't know if I'm understanding the question. One answer is that if the
return type is stack-allocated then it obviously goes back on the stack.
Another answer is that the region of the return pointer is parametric, and
unifies with the appropriate region from the caller. Parametric regions
make that sort of thing pretty straightforward.

The real question, I think, is whether a region parameter needs to get
passed in explicitly in order for the object to get allocated in the proper
region heap. Unfortunately I think the answer may turn out to be "yes".
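To make the "parametric region" idea concrete, here is a minimal sketch in
modern Rust syntax (not the 2013 dialect under discussion): the lifetime
parameter 'a plays the role of the region parameter, and the compiler unifies
the returned pointer's region with the caller's borrows, with no runtime
region argument passed at all.

```rust
// The returned reference's region is parametric ('a); at each call site
// it unifies with the regions of the arguments, so the compiler knows
// where the result "lives" without an explicit region being passed.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

fn main() {
    let a = String::from("region");
    let b = String::from("heap");
    // 'a unifies with the intersection of the two borrows' regions here.
    let r = longest(&a, &b);
    assert_eq!(r, "region");
}
```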


>
>
>> 3. *There are two heaps, not just one, so beginners are confused as to
>> which one to allocate into.* Well, yes. That really *is* confusing. And
>> it's compounded by the fact that (a) while you want a keyword for
>> prescription, discovery of unique pointers can be entirely automatic, and
>> (b) Rust advocates the wrong default (owned pointer), and (c) for most
>> purposes it's completely unnecessary. Owned pointer is the wrong default
>> because we teach general-purpose programming, and the semantics of an owned
>> pointer isn't general purpose.
>>
>
> This sounds like a slippery slope which would turn Rust into yet another
> GC language. I don't really understand the value in that. C/C++ have
> utility beyond GC languages because they don't admit GC, but they are
> filled with safety holes. Rust is trying to plug many of those safety holes
> while still being free of the trappings of GC. It may not be the path of
> bitc, but it sounds like a worthy goal to me, at least until bitc gives us
> a tradeoff-free GC model.
>

David, I think it's clear that you feel very strongly opposed to GC in some
circumstances, and that's a perfectly OK choice for you to make. I don't
share your view. I don't think that there will *ever* be a GC solution that
is completely free of tradeoffs, just as there isn't a manual solution that
is completely free of tradeoffs. For some applications, GC is notably *
faster* than manual memory management.

Let me come back to this, though, because your comment raises a point about
library design that I think is worth digging in to.


>
>
>> 4. *Programmers don’t know which to use, since some operations are
>> available with ~ and some operations are available with @*. Well hmm.
>> It's not clear to me what operations aren't supported on both pointers,
>>
>
> A linked list designed to avoid GC-tracing takes an owned pointer version
> of its payload. If it accepts a GC pointer to its payload, then the
> entire linked-list must be traced to trace those embedded GC pointers --
> negating the benefit of using owned pointers to begin with. This analogy
> extends to any owned-pointer-capable data-structure implementation.
>

I'm very skeptical about owned pointers in collections. It becomes very
difficult to correctly *use* those pointers in certain circumstances. In
essence, you get pushed toward idioms in which all "copies" of the pointers
are borrowed copies, and depending on how owned pointers actually work,
insertion during list construction can be a pain.

I'm not arguing against ownership or against owned lists here. I'm just
saying that the idiom issues for owned pointers are tricky.
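The friction I mean can be sketched with a Box-based list in today's Rust
(Box being the descendant of ~). Inserting at the head must *move* the old
list into the new cell; any outstanding borrowed copy of the list at that
point is rejected by the compiler, which is exactly the "all copies are
borrowed copies" idiom pressure.

```rust
// A minimal singly linked list over owned (Box) pointers.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Insertion consumes (moves) the entire existing list into the new cell;
// you cannot hold a borrow of the old list across this call.
fn push(head: List, value: i32) -> List {
    Cons(value, Box::new(head))
}

fn len(list: &List) -> usize {
    match list {
        Cons(_, rest) => 1 + len(rest),
        Nil => 0,
    }
}

fn main() {
    let mut l = Nil;
    for v in [1, 2, 3] {
        l = push(l, v); // each push moves the previous list wholesale
    }
    assert_eq!(len(&l), 3);
}
```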

I definitely do see that the notion of owning collections is powerful, and
that it has significant advantages in some cases.

The other thing I'd note about this is that a mostly-owned object graph
that contains even one GC'd reference has a way of needing to be traced,
though one can imagine ways to direct the tracing such that only the traced
cells need to be visited.


>
>> Within point 4, there is a key mis-statement:
>>
>> The key insight that was missing is that *the owning pointer ~ is just
>> the Rust equivalent of malloc and free*.
>>
>>
>> This statement is flatly wrong, because the two aren't equivalent at all.
>> Rust's owning pointer is a degenerate region allocation (degenerate because
>> they don't - or at least didn't - have a proper region system). That's not
>> the semantics of malloc/free at all.
>>
>
> Why did you (and do you still) think this? In my view owning pointers are a
> linear-types, and borrowed-pointers and lifetimes are an attempt to extend
> the scope within which Rust can share direct references to linear-type-ish
> owned-pointers.
>

So per my recent question, I'm not sure at the moment whether owned
pointers are linear types or a dynamic approximation to linear types, but
it doesn't really matter.

Perhaps I was quibbling about the word "equivalent". I think the statement
"owned pointers are what Rust provides *instead of* malloc and free" is
accurate. I also believe that if we set aside things that want cyclic data
structures, owned pointers probably cover most *correct* uses of
malloc/free. Caveat: Appel did some measurements on false liveness that
suggest that 20% of the live object set is falsely live. That is: the root
pointers that caused those objects to be live in the eyes of the collector
were no longer reachable by the program. I think it's reasonable to expect
similar issues from owned pointers.
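The "instead of malloc and free" reading can be illustrated in today's Rust,
with Box standing in for ~: allocation is explicit, but the free is implicit
at end of owner scope, so it cannot be forgotten or doubled. Note that the
free can still happen later than strictly necessary, which is the analogue of
the false-liveness issue above.

```rust
fn main() {
    // Roughly "malloc plus initialization":
    let p: Box<i32> = Box::new(41);
    let q = *p + 1; // use the allocation
    // No explicit free: p is reclaimed automatically when it goes out
    // of scope, which may be later than its last actual use.
    assert_eq!(q, 42);
}
```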

> If I understand region systems at all, they don't in themselves establish
> liveness for objects in the region.
>

Regions establish a *bound* on liveness, but depending on circumstances it
may or may not be a tight bound. This is also true for malloc and free
(free often occurs later than it could), and it is also true for stack
allocation (barring tricky optimizations, a non-live object on the stack
continues to occupy stack space).

In actual practice, there are many region patterns in which liveness (or
more precisely, the absence of operations that might cause the live set to
shrink) can be inferred.


> The region must be GCed if one wishes to reclaim elements no longer
> referenced in a dynamic region.
>

Not necessarily. The region must be GC'd if (a) there is reason to believe
that its live set may have shrunk, and (b) the amount of space you stand
to reclaim makes the GC worthwhile. The alternative is to wait until the
region scope exits, at which point you (eagerly or lazily, your choice)
reclaim the entire region. A mix of strategies is possible.
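The wait-until-scope-exit strategy can be sketched as a toy region in today's
Rust. The names (Region, alloc) are invented for illustration, not a real
API: allocations only accumulate, nothing is traced, and the whole backing
store is released in one step when the region is dropped.

```rust
// A toy region: per-object reclamation is never attempted; the entire
// region is reclaimed at once when it goes out of scope.
struct Region {
    storage: Vec<Vec<u8>>, // each allocation is one chunk
}

impl Region {
    fn new() -> Region {
        Region { storage: Vec::new() }
    }

    // Allocate n zeroed bytes in the region. The bytes live until the
    // whole region is dropped, even if they become dead earlier.
    fn alloc(&mut self, n: usize) -> &mut [u8] {
        self.storage.push(vec![0u8; n]);
        self.storage.last_mut().unwrap()
    }

    fn bytes_in_use(&self) -> usize {
        self.storage.iter().map(|c| c.len()).sum()
    }
}

fn main() {
    let high_water;
    {
        let mut r = Region::new();
        r.alloc(16);
        r.alloc(32);
        high_water = r.bytes_in_use();
        // r dropped here: the whole region is reclaimed in one step,
        // with no per-object tracing or collection.
    }
    assert_eq!(high_water, 48);
}
```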

Note, however, that collecting a region is *much* easier than collecting a
heap. Only regions *younger than* the region under collection can contain
pointers into the collected region. It's a lot like generational
collection, except without the possibility of pointers that cross
generation boundaries in the wrong direction. It also tends to be a
mainly-in-cache operation.

In some sense this is bad, because we're now considering the possibility
that *some* GCs could be OK even for the applications that concern you. It
was much easier to know what to do (language-wise) when the rule was more
black-and-white than that.




shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
