Re: [bitc-dev] LLVM vs CLI revisited

Jonathan S. Shapiro Wed, 08 Jan 2014 08:05:46 -0800

Ben:

These are some good catches. There are four different classes of issue here:


1. Type representability
2. GC
3. Type enforcement
4. Runtime

You also mention SIMD and auto-vectorization. Those concerns are a ways off
and we shouldn't let them distract us right now.


Concerning type representability: CLR is "OK enough" on unions. Internal
references can be handled, but in CLR they may force us to move values from
the stack to the heap in order to comply with CLR's safety checks.

Concerning GC: you are right that CLR (not Windows) dictates the choice of
GC. That is unfortunate, but non-semantic.

The answers on the rest are (regrettably) complicated.

We are contemplating what might be termed "whole assembly" compilation.
Within an assembly we can implement and enforce any type system we want. We
*may* (or may not - see below) have to do type erasure at assembly
boundaries. For dependent types, this means inserting checks on parameters
at assembly entry and return points.
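
A minimal sketch of what such boundary checks might look like, written in Python purely for illustration. The decorator `boundary_checked`, the procedure `get`, and the index-in-bounds constraint are all hypothetical; nothing here is BitC syntax or a CLR facility. The idea is only that a constraint checked statically inside the assembly becomes a dynamic check where the erased signature is exposed.

```python
from functools import wraps

def boundary_checked(pre, post):
    """Wrap an exported procedure with checks at assembly entry and return."""
    def decorate(fn):
        @wraps(fn)
        def exported(*args):
            # Entry check: re-establish what the erased signature cannot state.
            if not pre(*args):
                raise TypeError(f"{fn.__name__}: argument check failed at assembly entry")
            result = fn(*args)
            # Return check: validate the result before it leaves the assembly.
            if not post(result, *args):
                raise TypeError(f"{fn.__name__}: result check failed at assembly return")
            return result
        return exported
    return decorate

# Inside the assembly the (hypothetical) type is roughly
#   get : (v : Vector, i : Nat where i < len(v)) -> Elem
# Other assemblies only see the erased signature (list, int) -> object.
@boundary_checked(
    pre=lambda v, i: isinstance(v, list) and isinstance(i, int) and 0 <= i < len(v),
    post=lambda result, v, i: True,  # no constraint on the result in this example
)
def get(v, i):
    return v[i]
```

Callers inside the assembly would bind to the unchecked body directly; only cross-assembly calls would pay for the wrapper.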

There are clearly safety checks we can perform that CLR cannot verify.

Which brings me to the interesting one: linkage.  A lot depends on whether
we can rely on the linker to enforce type discipline at assembly
boundaries. You mentioned that LLVM requires the VC++ linker on Windows
platforms. I don't think it really matters.

There are three compilation models to consider: whole program, JIT, and
dynamically linked.

*Whole Program*

When we do whole-program compilation, we can use the native linker cleanly.
All we need in that scenario is link-time name resolution, which is really
no big deal. In that compilation model, we can check and enforce pretty
much any property we want. We gain essentially zero benefit from any safety
checking provided by the runtime layer. When targeting CLR, we either need
to target CLR-unsafe code or we need to restrict ourselves to things the
CLR's verifier can check. When targeting LLVM, we deal with a range of GC
issues. In either case we can live within the constraints of the native
object file format.

So that's a good solution for about two real-world programs and a lot of
early test cases. ;-)

*JIT*

The second possible compilation model is to do checking and code generation
at load time. In that case we are effectively building a JIT system. Even
if much of the compilation is AOT, we still need to be able to instantiate
previously unseen expansions of polymorphic procedures. If we truly build
our own JIT system, then we are not limited by the underlying linker or
object file format and we can check anything we want. If we try to do this
by loading CLR code, we'll have to see how much of this can be done in a
custom class loader. CLR also raises the issue that we need to figure out
where to stick supplemental type information that CLR cannot express.
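
The "previously unseen expansions" point can be sketched as a cache of instantiations that is seeded by AOT compilation and filled in on demand at load time. This is an illustrative Python sketch only: `Instantiator`, `lookup`, and `specialize_sum` are hypothetical names, and a Python closure stands in for generated code.

```python
class Instantiator:
    """Maps (procedure name, type arguments) to a compiled instantiation."""

    def __init__(self, aot):
        self.cache = dict(aot)   # instantiations produced ahead of time
        self.generated = []      # instantiations produced at load time

    def lookup(self, name, type_args, specialize):
        key = (name, tuple(type_args))
        if key not in self.cache:
            # Previously unseen expansion: generate it now and remember it.
            self.cache[key] = specialize(*type_args)
            self.generated.append(key)
        return self.cache[key]

def specialize_sum(elem_type):
    """Stand-in for code generation: build 'sum' specialized to elem_type."""
    def sum_of(xs):
        total = elem_type(0)
        for x in xs:
            total += elem_type(x)
        return total
    return sum_of
```

For example, an `Instantiator` seeded with only `("sum", (int,))` would generate `("sum", (float,))` the first time a loaded module asks for it, and reuse it afterwards.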

In effect, the JIT engine subsumes the entire linker issue.


*Dynamically Linked*

The dynamic linkage scenario is a mid-point between whole-program and JIT.
We build a resolving compile phase that picks up all of the assemblies and
figures out what code has to be produced AOT in order for them to bind
successfully. That code goes into a supplemental DLL that ships with the
"main" assembly, or possibly into the main assembly itself.
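
As a rough illustration of the resolving phase (a sketch under assumptions, not a design): suppose each assembly records which instantiations it provides and which it requires. The `provides` and `requires` fields and the function name below are hypothetical. The supplemental DLL then has to contain whatever is required but provided by nobody.

```python
def supplemental_instantiations(assemblies):
    """Return the instantiations that must be produced AOT for binding to succeed."""
    provided = set()
    required = set()
    for asm in assemblies:
        provided |= asm["provides"]
        required |= asm["requires"]
    # Anything required but not provided goes into the supplemental DLL.
    return sorted(required - provided)

# Hypothetical input: instantiations are named by (procedure, type argument).
main_asm = {
    "provides": {("main", "")},
    "requires": {("sort", "int"), ("map", "string")},
}
lib_asm = {
    "provides": {("sort", "int")},
    "requires": {("map", "float")},
}
missing = supplemental_instantiations([main_asm, lib_asm])
```

Here `missing` holds the two `map` instantiations, since only `("sort", "int")` is already available from a shipped assembly.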

As with JIT, we need to figure out where in the object file format we will
stick supplemental type information. We are effectively doing whole-program
compilation here, but we write the assemblies out as fragments that will be
stitched back together by the dynamic linker.

The problem with this model is that it doesn't support run-time *loading* very
well. The same scheme needs to work when compiling DLLs, and avoiding
collisions in the supplemental code is a bit tricky.



I am concerned that BitC object modules will rarely be able to use the
native object file format - or at least not in the way it was intended to
be used. The ELF format is flexible enough that I can almost imagine
carrying the type information we need in an additional ELF section that the
linker could then ignore. PE/COFF doesn't appear to have a comparable level
of flexibility, so figuring out where to put the type information presents
a bit of a puzzle.


Anyway, I hope it's at least clear why the fact that we can't really use
the native linker doesn't scare me. We're *way* past that! :-)


shap

