I probably share David's views in some ways, but for different reasons. I
think that without a major backer you can't practically build a good enough
GC system, so you need to evolve from a hybrid. This has the advantage that
the language is useful for hard-core systems programming, where there is a
clear need to replace C and C++.

I agree Rust picked the wrong pointer as the default, but that is almost
certain to happen with a C/C++ team.


*"Survival of a Small Subset* Just as Rust wants to hand object graphs
across Task boundaries, a region-based system can end up with region bloat
by virtue of object return. The pattern is that we build a bunch of things
on the way to returning just a few of them, but the fact that *some* are
returned causes the region constraints to propagate to everything we built,
with the result that the intermediate garbage cannot be collected eagerly
on return.

The only answer to this in a region-based system is to deepcopy the
subgraph being returned - which is not necessarily a bad thing to do. The
reason you have to copy it is driven at the type level by breaking the link
between regions, but also at the implementation level: you want region
reclaim to happen in bulk at the level of the allocator, which means that
objects in different regions should not be commingled in a single memory
"chunk".

ARC isn't really a good answer. The problem with ARC in this situation is
that it leads to heap fragmentation in the same way that malloc/free do.
The net effect of this is that an object allocated through ARC has to use a
completely different *kind* of allocator, which is significantly more
expensive than a GC-style allocator. In a surprising number of cases
(though not in all), deepcopy-on-return will be faster than ARC, and has
the added advantage that it can handle cycles. Note that the deepcopy can
be intelligent, in that we know the purpose of deepcopy is for
region-driven object survival, and we don't need to copy *or trace* objects
that are already in the desired output region."

Rust doesn't just hand over object graphs; by default most passing is by
value, with the GC pointers being less used. Yes, passing by deep copy is a
good thing and far too little used; the amount of CPU lost to multithreaded
contention is shocking.
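
Here is a rough sketch of what that "intelligent", region-aware deep copy
could look like. This is illustration only: Region, Node and deep_copy_into
are made-up names, the code is plain C++ rather than anything in Rust or
BitC, and the point is just that objects already living in the destination
region are neither copied nor traced, cycles are handled with a seen-map,
and the source region can be reclaimed in bulk afterwards.

// Sketch only: a minimal bump-allocated "region" and a region-aware deep copy.
#include <cstddef>
#include <new>
#include <unordered_map>
#include <utility>
#include <vector>

struct Region {
    // Each chunk is reclaimed in bulk when the region dies.
    std::vector<std::pair<char*, std::size_t>> chunks;
    std::size_t used = 0;

    void* alloc(std::size_t n) {
        // Round up so every allocation stays suitably aligned.
        n = (n + alignof(std::max_align_t) - 1) & ~(alignof(std::max_align_t) - 1);
        if (chunks.empty() || used + n > chunks.back().second) {
            std::size_t cap = n > 4096 ? n : 4096;
            chunks.emplace_back(new char[cap], cap);
            used = 0;
        }
        void* p = chunks.back().first + used;
        used += n;
        return p;
    }

    // Crude membership test: is this object allocated in one of my chunks?
    bool owns(const void* p) const {
        for (const auto& c : chunks)
            if (p >= static_cast<const void*>(c.first) &&
                p <  static_cast<const void*>(c.first + c.second))
                return true;
        return false;
    }

    ~Region() {
        for (auto& c : chunks) delete[] c.first;  // bulk reclaim, no per-object work
    }
};

struct Node {        // a toy object graph: a list/graph node
    int   value;
    Node* next;
};

// Copy the subgraph rooted at n into dst. Anything already allocated in dst
// is neither copied nor traced past; the seen-map handles cycles and sharing.
Node* deep_copy_into(Node* n, Region& dst,
                     std::unordered_map<Node*, Node*>& seen) {
    if (n == nullptr) return nullptr;
    if (dst.owns(n)) return n;                 // already in the output region
    auto it = seen.find(n);
    if (it != seen.end()) return it->second;   // cycle or shared node: reuse the copy
    Node* copy = new (dst.alloc(sizeof(Node))) Node{n->value, nullptr};
    seen[n] = copy;
    copy->next = deep_copy_into(n->next, dst, seen);
    return copy;
}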

Personally I think we are nowhere near a solution yet for running cross-task
comms on a shared GC; Singularity had a good go at it. I would offer:

- Pass by value
- Pass by deep copy
- Reference counting (I agree with the comments above, but it can be useful
for cases where resources are held and there are few allocations; it just
can't be the general whole-heap case unless it's a short-lived task)
- Pass by buffer (e.g. a buffer service with a byte[] and specific locking
operations: AccessBuffer, TransferBuffer, ReleaseBuffer; see the sketch
after this list)
- Global GC (of various types: pauseless, or compare-exchange increment for
fast allocation)
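
To make the buffer bullet concrete, here is a toy sketch of what such a
buffer service could look like. AccessBuffer / TransferBuffer / ReleaseBuffer
are the names from the list above; TaskId, BufferId, CreateBuffer and the
locking discipline are made up for illustration, and again it is plain C++
rather than a real API.

// Sketch only: a toy buffer service for passing byte[] payloads between tasks.
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

using TaskId   = int;
using BufferId = std::uint64_t;

class BufferService {
    struct Entry {
        std::vector<std::uint8_t> bytes;  // the raw byte[] payload
        TaskId owner  = 0;
        bool   mapped = false;            // at most one task has it mapped at a time
    };
    std::mutex mtx_;
    std::unordered_map<BufferId, Entry> table_;
    BufferId next_id_ = 1;

public:
    BufferId CreateBuffer(TaskId owner, std::size_t size) {
        std::lock_guard<std::mutex> g(mtx_);
        Entry e;
        e.bytes.resize(size);
        e.owner = owner;
        BufferId id = next_id_++;
        table_.emplace(id, std::move(e));
        return id;
    }

    // Map the buffer into the calling task; fails if it is already mapped
    // or owned by someone else.
    std::uint8_t* AccessBuffer(TaskId task, BufferId id) {
        std::lock_guard<std::mutex> g(mtx_);
        auto it = table_.find(id);
        if (it == table_.end() || it->second.owner != task || it->second.mapped)
            return nullptr;
        it->second.mapped = true;
        return it->second.bytes.data();
    }

    // Hand ownership to another task; only the owner field changes,
    // no bytes are copied.
    bool TransferBuffer(TaskId from, TaskId to, BufferId id) {
        std::lock_guard<std::mutex> g(mtx_);
        auto it = table_.find(id);
        if (it == table_.end() || it->second.owner != from || it->second.mapped)
            return false;
        it->second.owner = to;
        return true;
    }

    // Unmap; the buffer stays allocated for the next access or transfer.
    bool ReleaseBuffer(TaskId task, BufferId id) {
        std::lock_guard<std::mutex> g(mtx_);
        auto it = table_.find(id);
        if (it == table_.end() || it->second.owner != task || !it->second.mapped)
            return false;
        it->second.mapped = false;
        return true;
    }
};

The useful property is that TransferBuffer only changes the owner, so a large
payload can cross a task boundary without being copied and without the
general-purpose heap or GC ever seeing it.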

This gives control to the developer, who can use whatever is appropriate: in
most cases just a mix of GC, pass by value, and deep copy, but the other
cases can be useful to reduce stress on the GC and will significantly reduce
heap sizes. Most importantly, it's flexible, and whatever techniques work
well will become dominant.

Ben



On Mon, Jul 29, 2013 at 4:52 PM, Jonathan S. Shapiro <[email protected]> wrote:

> So David seems to be very interested in building libraries that don't GC.
> Let's first try to get the requirement stated more precisely, and then
> let's see if we have a solution for it.
>
> I think it's fair to say that we want the *semantics* of object death to
> be the same whether object reclamation occurs eagerly or lazily. For
> example, I don't think we want distinct kinds of finalizer logic for
> eagerly and lazily collected objects.
>
> I also think it's fair to say that we're not so much concerned with GC as
> we are concerned with knowing an upper bound on GC latency and ensuring
> that bound is acceptable for our application. In an application where
> objects may have finalizers, there is no way to drive the bound to zero,
> because object reclamation (either eager or lazy) requires the execution of
> user-defined code. So really, we're concerned with the latency to (a)
> discover the live set, and (b) discover that subset of the deceased object
> set that requires finalization.
>
> Regions *almost* solve this. We can certainly ensure that all of the
> allocations of interest occur within a known region, and we can very easily
> be explicit about when that region is reclaimed. There are two cases that
> regions will not address, but that owned pointers *can* address:
>
> *Reduction of Liveness*. If we have an existing non-borrowed object
> reference, and that object reference is overwritten, then the live set for
> the target region has potentially gotten smaller. Until this occurs, no
> profit can be had from GC. Note that in region-based allocation, liveness
> does not shrink on a per-object basis when the stack shrinks - that
> operation causes liveness to shrink on a per-region basis, which doesn't
> require a trace operation prior to reclamation. This is something we can
> check easily, and infer in most cases.
>
> *Survival of a Small Subset* Just as Rust wants to hand object graphs
> across Task boundaries, a region-based system can end up with region bloat
> by virtue of object return. The pattern is that we build a bunch of things
> on the way to returning just a few of them, but the fact that *some* are
> returned causes the region constraints to propagate to everything we built,
> with the result that the intermediate garbage cannot be collected eagerly
> on return.
>
> The only answer to this in a region-based system is to deepcopy the
> subgraph being returned - which is not necessarily a bad thing to do. The
> reason you have to copy it is driven at the type level by breaking the link
> between regions, but also at the implementation level: you want region
> reclaim to happen in bulk at the level of the allocator, which means that
> objects in different regions should not be commingled in a single memory
> "chunk".
>
> ARC isn't really a good answer. The problem with ARC in this situation is
> that it leads to heap fragmentation in the same way that malloc/free do.
> The net effect of this is that an object allocated through ARC has to use a
> completely different *kind* of allocator, which is significantly more
> expensive than a GC-style allocator. In a surprising number of cases
> (though not in all), deepcopy-on-return will be faster than ARC, and has
> the added advantage that it can handle cycles. Note that the deepcopy can
> be intelligent, in that we know the purpose of deepcopy is for
> region-driven object survival, and we don't need to copy *or trace* objects
> that are already in the desired output region.
>
>
> I said in another note that Rust picked the wrong pointer default, and
> that the sensible behavior should be "infer the regions you can, and GC
> everything else so that pointers appear to have the expected general
> behavior". My rationale for this is that you want to keep things simple for
> beginners, and escalate conceptual complexity only where it has payoff.
> Designing *good* libraries is surprisingly hard to do; it's not an
> activity done by everyday programmers. Better to defer confusing
> complications until they are motivated. Even when an advanced programmer is
> building a library, it may make sense to defer dealing with allocation
> strategy until the algorithms are clearly understood.
>
> So for the kinds of libraries David is concerned about, my proposal would
> be to first build the library without concern for GC, and then annotate all
> of the procedure types at the library interface with something like a *
> NoGarbage* effect. A procedure satisfies "NoGarbage" if (a) the only live
> object at the point of return is the one returned by the procedure, and (b)
> no tracing is performed or induced by the execution of the procedure. If
> you want this to be a property of the entire library, I don't see any
> reason not to stick effects on interfaces, with the intended meaning that
> they apply to all functions in the interface.
>
> A procedure is "memoryless" if (a) it is a NoGarbage procedure, and (b)
> the region of the return result is the caller region, or a region
> explicitly supplied by the caller.
>
> Now when you add this effect constraint on the interface, what happens is
> that the compiler now has to propagate the effect and make sure all of your
> code satisfies the effect. And of course some of it won't, and now you
> start getting messages about all sorts of places where the compiler's
> attempt to do smart region inference didn't get you what you wanted. And
> those you go fix.
>
>
> In any case, I think that the compiler can successfully infer a lot of
> cases where region-based reclaim is possible and sensible, and it can
> identify and provide useful diagnostics in the cases where it is not.
>
>
>
> shap
>