On Thu, Aug 22, 2013 at 12:20 PM, David Jeske <[email protected]> wrote:
> On Thu, Aug 22, 2013 at 11:12 AM, Jonathan S. Shapiro <[email protected]> wrote:
>
>> 1. We're going to have a byte-code form at a level of abstraction similar
>> to that of CLR. Which means we aren't limited to static or dynamic linking
>> as our means of dealing with versioning. For better or worse, any language
>> that combines generics/parametrics with unboxed types is going to need this.
>
> Agreed. I think a subtlety is that we'd like this
> intermediate-representation to not be trivially decompilable into full
> source -- which I think at least means replacing non-exported
> human-readable symbols with something else.

You know, if we slap enough extraneous requirements on this problem we'll
end up building a Joint Strike Fighter. :-) This is a perfectly fine problem
for an after-market obfuscator. I've seen some work on *symbolic*
disassembly of stripped binary code. It's amazing how little an obfuscator
actually buys you.

>> 1. Link-and-load: a relocation-style rewriting pass that is not as complex
>> as JIT. The main limitation of this is emitting procedure calls having
>> unboxed parameter types, mainly because that doesn't mesh well with the
>> underlying calling conventions on most machines. Fortunately we only have
>> to respect those calling conventions at points where cross-language calls
>> occur.
>
> This is close to the strategy of Objective-C + ilvar, though they offload
> the method call polymorphism to runtime dispatch. This seems also remotely
> like the approach of WinRT/CXX.

Is there some place I can read up on that?

> However, this link-and-load approach encourages a fight between
> polymorphism and performance. This is the wedge that's driving JIT into
> modern systems, as we desire both.

I think that's a potential issue, but I go back and forth on how serious it
is. My suspicion is that if you can get aggressive inlining at ground types
- and I think you can - then most of the rest isn't that bad.
You aren't going to get the ideal degree of inlining, but with virtual
functions you wouldn't get that in any case. This is one of those sorts of
issues where intuitions can turn out to be wildly off, and the only real way
to know the overhead is to do the work and measure the result.

>> 2. Post-install AOT: When a new library arrives we compile it with AOT.
>> We keep the old version of the library. We then run around recompiling all
>> of the things that depend on that library one at a time. When all of those
>> have been compiled, we drop the old version of the library. As processes
>> halt and restart in the normal order of things, the old version of the
>> library disappears.
>
> I presume the recompile of "things that depend on the library" can then
> inline as needed?

Yes. In fact, if you look at this as a "staged AOT" sort of thing, you can
extract much of the benefit of whole-program compilation. In particular, if
you can determine that the program doesn't do dynamic loading, you have a
whole-program picture of which types get instantiated. That lets you turn a
whole lot of calls that would otherwise be virtual calls into static
procedure calls, which can then be inlined. It's also enough to let you do
optimizations across procedure call boundaries. If you can see, for example,
that g() is only called from f() in this program, then you would go ahead
and inline g() even if it wouldn't otherwise meet the "should I inline"
heuristics.

> My naive view is that determining what to inline without call frequencies
> in a very polymorphic interface-based world is a hard problem...

In the link-and-load approach, you don't have the mechanism to do late
inlining in any case. You can inline most block move operations, which
helps, but that's about it. You can also inline anything where the concrete
type is fully known at static compile time - that covers most ground types.
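
To make the whole-program devirtualization point above concrete, here is a
minimal sketch in Python. It is an illustration only, not how any BitC or CLR
tool works: the class table, the set of instantiated types, and the names
`resolve`, `is_subtype`, and `devirtualize` are all hypothetical. It assumes
the program does no dynamic loading, so the instantiated set is complete.

```python
# Hypothetical class table: name -> (parent or None, methods defined here).
CLASSES = {
    "Shape": (None, {"area"}),
    "Circle": ("Shape", {"area"}),
    "Square": ("Shape", {"area"}),
    "Logger": (None, {"log"}),
    "FileLogger": ("Logger", {"log"}),
}


def resolve(classes, cls, method):
    """Return the class whose definition of `method` runs for a `cls` object."""
    while cls is not None:
        parent, methods = classes[cls]
        if method in methods:
            return cls
        cls = parent
    return None


def is_subtype(classes, cls, base):
    """True if `cls` is `base` or inherits from it."""
    while cls is not None:
        if cls == base:
            return True
        cls = classes[cls][0]
    return False


def devirtualize(classes, instantiated, call_sites):
    """Map each (static_type, method) call site to its single possible
    target, or to None when the call has to stay virtual (or is dead)."""
    result = {}
    for static_type, method in call_sites:
        targets = {
            resolve(classes, cls, method)
            for cls in instantiated
            if is_subtype(classes, cls, static_type)
        }
        targets.discard(None)
        sole = next(iter(targets)) if len(targets) == 1 else None
        result[(static_type, method)] = sole
    return result


# Assumed whole-program facts: only these types are ever instantiated.
INSTANTIATED = {"Circle", "Square", "FileLogger"}
CALL_SITES = [("Shape", "area"), ("Logger", "log")]
RESULT = devirtualize(CLASSES, INSTANTIATED, CALL_SITES)
```

With these inputs the `Logger.log` call site has exactly one possible target
(`FileLogger`), so it can become a static call and a candidate for inlining,
while `Shape.area` still has two targets and stays virtual.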
In the install-time AOT approach, you can do all of the usual *static*
inlining decisions, and then you can do some others that are enabled by
whole-program compilation (as I outlined above). I actually don't know of
any quantitative measurements about the further inlining benefit that can be
realized when dynamic trace information is available, but I don't see
anything that would (technically) preclude the collection of those traces. I
see a *lot* of reasons why trace-based compilation is problematic from a
security perspective.

> By my view the line between AOT and JIT is a fuzzy one. Some modern
> systems make use of runtime code generation (regex, sql, etc), if this is
> supported then JIT is effectively supported.

I think you mean "it's fuzzy if you can get trace information and re-run the
AOT mechanism". I *almost* agree. The thing you still won't get out of that
is optimizations that rely on commonly used run-time values that can be
JITted as special cases. But a lot of those seem to be obtainable using
static metacompilation in the style of Engler's "tick C" ('cc).

>> The main problem that the second approach doesn't solve is run-time
>> loading of *unregistered* code.
>
> This sounds like an over-constraint. A Photoshop plugin could easily be
> "run-time registered" into a running program, "AOT compiled" while the
> program waits, and then loaded. As long as there is no type-definition
> collision, the difference between AOT and JIT here is just time-delay and
> inlining differences.

I think I wasn't clear what I meant by "unregistered". The registry
information I'm thinking about is merely a record that "program X loads
assembly Y". Yes, you can build that dynamically if the install database
doesn't give you the information you need, but I think you are losing track
of what problem we were trying to solve, which was version updates. What
*you* are saying is that we can run AOT on assembly Y as late as load time.
That's correct.
But the problem we were originally trying to solve was that a version change
on assembly Y requires a recompile of the *loading* program X. We obviously
cannot do that while X is running.

> What the second approach does not admit is run-time registration and
> loading of a *new-version* of an existing registered assembly -- since that
> would require stopping and recompiling the application.

Yes. More precisely: a new version of a registered assembly such that the
portion of the interface consumed by the loading program has actually
changed. If you are really designing for dynamic loading, you work pretty
hard to keep that interface type stable, because (as you say)
type-redefinition is a mess of problems.

> IMO, the only additional benefit of JIT is faster launch for previously
> unknown code...

JIT also has significant advantages for debugging.

shap
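
P.S. A rough sketch, in Python, of the "more precisely" condition above:
given a record of which programs load which assemblies, a new version only
forces a stop-and-recompile for programs whose *consumed* part of the
interface changed. The registry layout, the string signatures, and the names
`changed_symbols` and `must_recompile` are assumptions made up for this
illustration. It also ignores bodies that were inlined across the assembly
boundary, which would need their own check.

```python
def changed_symbols(old_iface, new_iface):
    """Names whose signature was added, removed, or altered."""
    names = old_iface.keys() | new_iface.keys()
    return {n for n in names if old_iface.get(n) != new_iface.get(n)}


def must_recompile(registry, assembly, old_iface, new_iface):
    """Programs that consume at least one changed symbol of `assembly`.

    registry: program name -> {assembly name: set of consumed symbols}
    """
    changed = changed_symbols(old_iface, new_iface)
    return sorted(
        program
        for program, loads in registry.items()
        if loads.get(assembly, set()) & changed
    )


# Hypothetical install database: "program X loads assembly Y" records,
# extended with the symbols each program actually uses.
REGISTRY = {
    "editor": {"imglib": {"load", "save"}},
    "viewer": {"imglib": {"load"}},
    "calc": {"mathlib": {"add"}},
}

OLD_IMGLIB = {
    "load": "(Path) -> Image",
    "save": "(Image, Path) -> Unit",
    "resize": "(Image, Int, Int) -> Image",
}
# The new version changes only the signature of `save`.
NEW_IMGLIB = dict(OLD_IMGLIB, save="(Image, Path, Format) -> Unit")

AFFECTED = must_recompile(REGISTRY, "imglib", OLD_IMGLIB, NEW_IMGLIB)
```

Here only `editor` consumes the changed `save` entry, so under these
assumptions `viewer` could pick up the new `imglib` without being recompiled.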
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
