On Wed, Aug 21, 2013 at 11:32 PM, David Jeske <[email protected]> wrote:
> On Wed, Aug 21, 2013 at 10:15 PM, William ML Leslie <[email protected]> wrote:
>
>> ...the only things that matter are the object code and the declared
>> interface. Is there anything about this that I'm missing?
>
> I'd flip that around and say that what matters is that for the type of
> interface you would like to express and an interesting set of library
> changes, a new binary object can be supplied for which applications
> compiled against the old binary can run-time link and run against the new
> binary.

That's an overstated requirement. Whether the mechanism is run-time linking, recompile-on-first-launch, or something else is a separate matter from the technical compatibility issue.

> C's ABI compatibility model is only sufficient for the kinds of simple
> things expressed in C APIs. More complex system libraries and interfaces
> have since moved past the binary-compatibility capabilities of C-style
> ABIs. Some of those capabilities include Generics, Interface programming,
> parametric instantiation, and automatic memory management.

Yes. And in consequence, *all* of these more advanced systems have effectively abandoned static or run-time linking as the abstraction layer at which compatibility is addressed. Conceptually, they have fallen back to whole-program compilation from a high-level intermediate form, and then implemented that compilation process incrementally in the form of a JIT compiler.

There are a whole *bunch* of ways that this makes things much simpler. In particular, there is a long laundry list of messy problems (e.g. the fragile base class problem) that disappear entirely if the compatibility solution has access to type information and is able to produce new code rather than adjust old code.

I don't think it's accurate to characterize this as a modular solution. There is a surface appearance of modularity at the stage where source units are compiled to byte code. While the byte codes themselves look similar to instructions and admit a straightforward interpreted implementation, that code is still a relatively abstract, high-level form. In the presence of generics, for example, various sorts of offsets and function addresses aren't known at byte-code compile time. It's not just that a resolution pass in the style of a relocating link has to be done; you don't really even know the sizes and alignments of structures in many cases.

> Pulling away from these details, the fact remains that there is no
> successful model for having two different GC tracing systems coordinate to
> determine reachable resources. This forces us to express all boundaries out
> of the GC systems in terms of non-tracing memory management contracts.

That's not quite true, but it's nearly true. Singularity stands out as a counter-example.

The interesting cases here are the ones in which two "processes" running in separate heaps want to share an object by reference. It isn't hard to register "foreign" roots for such objects, and it isn't that hard to make the respective GCs concurrent enough that the mark passes don't step on each other. The practical challenge turns out to be the object header structure. Different styles of GC tend to want different information in the object header. When cross-heap references occur, sanity of implementation tends to push you toward a *common* object header.
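A minimal sketch of what such a compromise header and foreign-root registration might look like, written in Rust for concreteness. The field layout, names, and the ForeignRootTable helper are illustrative assumptions, not Singularity's actual format or any particular runtime's API:

    // Illustrative compromise object header for a shared heap that two
    // different collectors must both be able to parse. The layout is
    // fixed (#[repr(C)]) so every participating runtime agrees on
    // field offsets.
    #[repr(C)]
    struct SharedHeader {
        // Index into a table of type descriptors that all participating
        // runtimes register up front (sizes, pointer maps, alignment).
        type_id: u32,
        // Per-collector mark/age bits. A mark-sweep collector and a
        // copying collector want different things here, so the shared
        // heap reserves enough bits for the union of their needs.
        gc_bits: u32,
        // Forwarding-pointer slot, used only if some participating
        // collector relocates shared objects; zero otherwise.
        forward: usize,
    }

    // A "foreign root" registration: a private heap tells the shared
    // heap's collector which shared objects it still reaches, so the
    // two collectors' mark passes don't have to coordinate
    // object-by-object.
    struct ForeignRootTable {
        roots: Vec<usize>, // addresses of shared-heap objects
    }

    impl ForeignRootTable {
        fn register(&mut self, addr: usize) {
            self.roots.push(addr);
        }
    }

    fn main() {
        let mut table = ForeignRootTable { roots: Vec::new() };
        table.register(0x1000); // hypothetical shared-heap address
        println!("header size: {} bytes, registered roots: {}",
                 std::mem::size_of::<SharedHeader>(),
                 table.roots.len());
    }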
But the practical impact of this is pretty limited, because the data that can *usefully* be shared across processes tends to be "mostly reference free", and the number of heap-exiting pointers involved for any given process tends to be small. The collector has to deal with foreign pointers anyway (e.g. pointers to native objects), so it already needs to recognize pointers to things outside the heap. It's therefore possible to imagine designs in which a dedicated "shared" heap is used, and the shared heap uses a compromise object header structure that is commonly handled by all participating GCs. This is effectively what Singularity does.

The limitation in that design is the problem of copy avoidance. In practical systems, most of the benefit of cross-heap object sharing is a result of copy avoidance. To achieve copy avoidance in the Singularity shared-heap style of design, you need to plan ahead and build the object in the shared heap in the first place. When this requirement is combined with linear types (which is how Singularity managed content in the shared heap), the result is very hard to use. There are *many* codes that would like to operate on things in the shared heap that have fork/join patterns in their control flow. Linear types don't tolerate such control flow. Network stacks, as one example, involve lots of these.

So in practice, the problem in Singularity had nothing to do with GC at all. It had to do with the decision to use a typing mechanism whose limitations were a bad impedance match for the problems people were actually trying to solve. I wonder, in hindsight, whether some combination of ARC and borrowed pointers would not have worked better.

> My conclusion is that we should continue to evolve one or two very capable
> cross-language runtimes, like CLR, because things are easier if we only
> have one managed runtime in the mix.

That's a good plan if it can be accomplished. You are hereby appointed as the Microsoft liaison....

shap
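A small sketch of the fork/join friction described above, using Rust move semantics as a rough stand-in for Singularity's linear types and shared borrows as a stand-in for borrowed pointers. The Packet type and the stage functions are hypothetical, chosen only to show the shape of the problem:

    // Illustrative only: move semantics approximating a linear
    // discipline, borrows approximating borrowed pointers.
    struct Packet {
        payload: Vec<u8>, // imagine this living in the shared heap
    }

    // Linear-style API: each stage consumes the packet and hands it back.
    fn checksum_owned(p: Packet) -> Packet { /* ... */ p }
    fn route_owned(p: Packet) -> Packet { /* ... */ p }

    // Borrow-style API: stages inspect the packet without consuming it.
    fn checksum(p: &Packet) -> u32 {
        p.payload.iter().map(|b| *b as u32).sum()
    }
    fn route(p: &Packet) -> usize {
        p.payload.len() % 4 // pretend this picks an output port
    }

    fn main() {
        let p = Packet { payload: vec![1, 2, 3, 4] };

        // With the linear-style API, the packet must be threaded through
        // the stages in one strict sequence, because each call consumes
        // it. Handing the "same" packet to two arms of a fork is a type
        // error:
        //     let a = checksum_owned(p);
        //     let b = route_owned(p);   // error: use of moved value `p`
        let p = route_owned(checksum_owned(p));

        // With borrows, both arms of the fork can look at the packet,
        // and it is still usable at the join point.
        let (sum, port) = (checksum(&p), route(&p));
        println!("sum={} port={} len={}", sum, port, p.payload.len());
    }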
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
