Re: [bitc-dev] Runtimes where binary shared libraries can be compatibly updated

Jonathan S. Shapiro Thu, 22 Aug 2013 14:56:26 -0700

On Thu, Aug 22, 2013 at 12:20 PM, David Jeske <[email protected]> wrote:

> On Thu, Aug 22, 2013 at 11:12 AM, Jonathan S. Shapiro <[email protected]> wrote:
>
>> 1. We're going to have a byte-code form at a level of abstraction similar
>> to that of CLR. Which means we aren't limited to static or dynamic linking
>> as our means of dealing with versioning.  For better or worse, any language
>> that combines generics/parametrics with unboxed types is going to need this.
>>
>
> Agreed. I think a subtlety is that we'd like this
> intermediate-representation to not be trivially decompilable into full
> source -- which I think at least means replacing non-exported
> human-readable symbols with something else.
>

You know, if we slap enough extraneous requirements on this problem we'll
end up building a Joint Strike Fighter. :-)  This is a perfectly fine
problem for an after-market obfuscator. I've seen some work on *symbolic*
disassembly of stripped binary code. It's amazing how little an obfuscator
actually buys you.

>> 1. Link-and-load: a relocation-style rewriting pass that is not as complex
>> as JIT. The main limitation of this is emitting procedure calls having
>> unboxed parameter types, mainly because that doesn't mesh well with the
>> underlying calling conventions on most machines. Fortunately we only have
>> to respect those calling conventions at points where cross-language calls
>> occur.
>>
>
> This is close to the strategy of Objective-C + ivar, though they offload
> the method call polymorphism to runtime dispatch. This seems also remotely
> like the approach of WinRT/CXX.
>

Is there some place I can read up on that?

> However, this link-and-load approach encourages a fight between
> polymorphism and performance. This is the wedge that's driving JIT into
> modern systems, as we desire both.
>

I think that's a potential issue, but I go back and forth on how serious it
is. My suspicion is that if you can get aggressive inlining at ground types
- and I think you can - then most of the rest isn't that bad. You aren't
going to get the ideal degree of inlining, but with virtual functions you
wouldn't get that in any case. This is one of those sorts of issues where
intuitions can turn out to be wildly off, and the only real way to know the
overhead is to do the work and measure the result.

>
>
>> 2. Post-install AOT: When a new library arrives we compile it with AOT.
>> We keep the old version of the library. We then run around recompiling all
>> of the things that depend on that library one at a time. When all of those
>> have been compiled, we drop the old version of the library. As processes
>> halt and restart in the normal order of things, the old version of the
>> library disappears.
>>
>
> I presume the recompile of "things that depend on the library" can then
> inline as needed?
>

Yes. In fact, if you look at this as a "staged AOT" sort of thing, you can
extract much of the benefit of whole-program compilation. In particular, if
you can determine that the program doesn't do dynamic loading, you have a
whole-program picture of which types get instantiated. That lets you turn a
whole lot of calls that would otherwise be virtual calls into static
procedure calls, which can then be inlined. It's also enough to let you do
optimizations across procedure call boundaries. If you can see, for
example, that g() is only called from f() in this program, then you would
go ahead and inline g() even if it wouldn't otherwise meet the "should I
inline" heuristics.
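
A minimal sketch of the two whole-program checks described above, assuming
the compiler already has a table of method implementations per type, the set
of types the program instantiates, and a list of call sites. The function
names and the data layout are made up for illustration, and method names are
treated as globally unique slots, which a real compiler would not assume.

```python
from collections import Counter, defaultdict


def devirtualizable(implementations, instantiated):
    """Map each virtual method to its sole live target, if there is only one.

    implementations: {type_name: {method_name: function_label}}
    instantiated: set of type names the whole program is known to construct
    (only trustworthy when the program does no dynamic loading).
    """
    targets = defaultdict(set)
    for type_name in instantiated:
        for method, label in implementations.get(type_name, {}).items():
            targets[method].add(label)
    # A virtual call can become a static call only when one target is live.
    return {m: next(iter(t)) for m, t in targets.items() if len(t) == 1}


def single_call_site(call_sites):
    """Return {callee: caller} for callees with exactly one call site.

    call_sites: list of (caller, callee) pairs, one entry per call site.
    """
    counts = Counter(callee for _, callee in call_sites)
    return {callee: caller for caller, callee in call_sites
            if counts[callee] == 1}


# Hypothetical program: three shape types exist, but only some are built.
IMPLS = {
    "Circle": {"area": "Circle.area", "name": "Shape.name"},
    "Square": {"area": "Square.area", "name": "Shape.name"},
    "Hexagon": {"area": "Hexagon.area", "name": "Shape.name"},
}
CALLS = [("main", "f"), ("f", "g"), ("f", "h"), ("main", "h")]
```

With only Circle instantiated, every call through `area` has one possible
target and can be made static. Once Square is also instantiated, `area`
stays virtual while the shared `name` can still be resolved. In CALLS, g()
has a single call site in f(), so it is a candidate for inlining regardless
of its size.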

> My naive view is that determining what to inline without call frequencies
> in a very polymorphic interface-based world is a hard problem...
>

In the link-and-load approach, you don't have the mechanism to do late
inlining in any case. You can inline most block move operations, which
helps, but that's about it. You can also inline anything where the concrete
type is fully known at static compile time - that covers most ground types.

In the install-time AOT approach, you can make all of the usual *static*
inlining decisions, and then you can make some others that are enabled by
whole-program analysis (as I outlined above). I actually don't know of any quantitative
measurements about the further inlining benefit that can be realized when
dynamic trace information is available, but I don't see anything that would
(technically) preclude the collection of those traces. I see a *lot* of
reasons why trace-based compilation is problematic from a security
perspective.

> By my view the line between AOT and JIT is a fuzzy one. Some modern
> systems make use of runtime code generation (regex, sql, etc), if this is
> supported then JIT is effectively supported.
>

I think you mean "it's fuzzy if you can get trace information and re-run
the AOT mechanism". I *almost* agree. The thing you still won't get out of
that is optimizations that rely on commonly used run-time values that can
be JITted as special cases. But a lot of those seem to be obtainable using
static metacompilation in the style of Engler's "tick C" (`C).

>
>
>> The main problem that the second approach doesn't solve is run-time
>> loading of *unregistered* code.
>>
>
> This sounds like an over-constraint. A Photoshop plugin could easily be
> "run-time registered" into a running program, "AOT compiled" while the
> program waits, and then loaded. As long as there is no type-definition
> collision, the difference between AOT and JIT here is just time-delay and
> inlining differences.
>

I think I wasn't clear about what I meant by "unregistered". The registry
information I'm thinking about is merely a record that "program X loads
assembly Y". Yes, you can build that dynamically if the install database
doesn't give you the information you need, but I think you are losing track
of what problem we were trying to solve, which was version updates. What
*you* are saying is that we can run AOT on assembly Y as late as load time.
That's correct. But the problem we were originally trying to solve was that
a version change on assembly Y requires a recompile of the *loading* program
X. We obviously cannot do that while X is running.
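
To make the bookkeeping concrete, here is a rough sketch of such a registry
combined with the post-install AOT sequence: install the new version next to
the old one, recompile dependents one at a time, and drop the old version
once nothing was compiled against it. The class and method names are
hypothetical. Skipping a running program until it restarts is a
simplification chosen for this sketch, not a claim about how a real
installer would schedule the work.

```python
class InstallRegistry:
    """Records "program X loads assembly Y" and which version X was built against."""

    def __init__(self):
        self.loads = {}          # program -> set of assemblies it loads
        self.built_against = {}  # (program, assembly) -> version compiled against
        self.installed = {}      # assembly -> set of versions kept on disk

    def register(self, program, assembly, version):
        self.loads.setdefault(program, set()).add(assembly)
        self.built_against[(program, assembly)] = version
        self.installed.setdefault(assembly, set()).add(version)

    def install(self, assembly, version):
        # The new version is added alongside the old one, not in place of it.
        self.installed.setdefault(assembly, set()).add(version)

    def stale_dependents(self, assembly, version):
        # Programs that load the assembly but were compiled against another version.
        return sorted(
            p for p, deps in self.loads.items()
            if assembly in deps and self.built_against[(p, assembly)] != version
        )

    def recompile(self, program, assembly, version, running=()):
        # A running program is left alone in this sketch; retry after it exits.
        if program in running:
            return False
        self.built_against[(program, assembly)] = version
        return True

    def collect(self, assembly, current):
        # Drop every version that no program was built against, except `current`.
        needed = {v for (_, a), v in self.built_against.items() if a == assembly}
        self.installed[assembly] = {
            v for v in self.installed[assembly] if v in needed or v == current
        }
```

Usage, with X still running when version 2 of Y arrives:

    reg = InstallRegistry()
    reg.register("X", "Y", 1)
    reg.install("Y", 2)
    reg.recompile("X", "Y", 2, running={"X"})   # False: X must restart first
    reg.collect("Y", current=2)                  # version 1 is still kept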

>
> What the second approach does not admit is run-time registration and
> loading of a *new-version* of an existing registered assembly -- since that
> would require stopping and recompiling the application.
>

Yes. More precisely: a new version of a registered assembly such that the
portion of the interface consumed by the loading program has actually
changed. If you are really designing for dynamic loading, you work pretty
hard to keep that interface type stable, because (as you say)
type-redefinition is a mess of problems.
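
The "portion of the interface consumed" test can be sketched as a comparison
restricted to the symbols the loading program actually references. This
assumes each assembly version exposes a map from symbol name to some
canonical signature; the string signatures and the function name below are
invented for illustration.

```python
def needs_recompile(old_iface, new_iface, consumed):
    """True if any symbol the loading program uses changed or disappeared.

    old_iface, new_iface: {symbol_name: signature}
    consumed: set of symbol names the loading program references
    """
    return any(old_iface.get(s) != new_iface.get(s) for s in consumed)


# Hypothetical interfaces for two versions of an assembly.
OLD = {"open": "(str) -> Handle", "close": "(Handle) -> unit"}
NEW = {
    "open": "(str) -> Handle",
    "close": "(Handle, bool) -> unit",   # signature changed
    "flush": "(Handle) -> unit",         # newly added
}
```

A program that only consumes `open` is unaffected by the new version, even
though `close` changed and `flush` was added. A program that also consumes
`close` has to be recompiled.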

> IMO, the only additional benefit of JIT is faster launch for previously
> unknown code...
>

JIT also has significant advantages for debugging.

shap

_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
