John Rose wrote:
> On Aug 28, 2008, at 9:51 AM, Charles Oliver Nutter wrote:
>
>> With more research and time and unboxing smarts, it could approach
>> Java perf.
>>
>
> The invokedynamic instruction removes signature-related bottlenecks from
> dynamically typed JVM calls, allowing dynamic calls to be compiled with
> any desired signature, not just reflective style (varargs with boxed
> primitives) or Smalltalk style (fixed argument count, with boxed
> primitives). It also provides a vm-supported way to plug-and-play with
> method references, which until now everybody had to invent for
> themselves (since java.lang.reflect.Method does too much).
>
> So if the dynamic language backend can do enough type inference to make
> calls "optimistically" with some unboxed arguments, and if the call
> target actually accepts a signature that is close to the caller's
> signature, a cheap adapter can be inserted by the runtime, and the call
> can go through with a minimum of data motion. If the signatures match
> exactly (or closely enough so that no dynamic checks are needed to
> verify data integrity), then the call can be as direct as a normal Java
> call, and can be inlined just the same.
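As a rough illustration of the "cheap adapter" idea in the quoted passage, here is a minimal sketch. It assumes the java.lang.invoke API as it later shipped in Java 7, which postdates this message, so the package and method names are not the ones under discussion at the time. The class AdapterSketch and its add, addHandle, callExact and callBoxed methods are hypothetical names made up for the example. One call site matches the target's signature exactly; the other uses boxed types and lets asType supply the unboxing and boxing conversions.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class AdapterSketch {
    // Hypothetical call target that accepts unboxed ints.
    static int add(int a, int b) {
        return a + b;
    }

    // Look up a direct method handle of type (int,int)int for add.
    static MethodHandle addHandle() {
        try {
            return MethodHandles.lookup().findStatic(
                AdapterSketch.class, "add",
                MethodType.methodType(int.class, int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Caller signature matches the target exactly: no adapter is needed.
    static int callExact(int a, int b) {
        try {
            return (int) addHandle().invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Caller only has boxed values: asType wraps the target in an adapter
    // that unboxes the arguments and boxes the result.
    static Object callBoxed(Integer a, Integer b) {
        MethodHandle adapted = addHandle().asType(
            MethodType.methodType(Object.class, Integer.class, Integer.class));
        try {
            return adapted.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callExact(2, 3));
        System.out.println(callBoxed(2, 3));
    }
}
```

Whether a JIT actually inlines through either handle depends on the VM; the sketch only shows the signature matching and the adapter, not the optimization.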

I literally get goosebumps every time I think about this. We keep track of compiler logs in JRuby, watching for inlining opportunities, "too big" methods, and exploring ways to reduce trap occurrences, but even the cleanest, simplest benchmarks never have enough room to inline more than one dynamic recursion, even with all we've done to reduce the complexity of the call path. The promise of invokedynamic is that we'll be able to remove almost all of that dynamic dispatch logic from all calls and give HotSpot the "keys to the castle" right at the call site itself. That's really huge.

> Also, because the JSR 292 JVM supports method handles directly, there
> can be close coupling between the handle used by the dynamic runtime and
> the underlying "real" method, to the point where optimizers routinely
> "see through" the handle to the method. Because of the complexity and
> semantic mismatch of java.lang.reflect.Method, this close coupling has
> been rare in the past.

I'm also absolutely thrilled that both handles and anonymous classloading are there, with or without invokedynamic, since largely the limiting factor for continued JRuby optimization has been the gross overhead incurred by generating and regenerating bytecode in the form of call adapters, custom call sites (like DLR's DynamicSite) and type/arity-specific invocation handles. I hope this will finally free us from those bonds.

> The statefulness of invokedynamic (as seen through CallSite.setTarget)
> is intended to be exactly enough for the JIT to process dynamic call
> sites statically. It can optimistically inline the method handle at
> every invokedynamic call site, and take corrective action if a new
> target comes along and invalidates inlined code. We've been doing this
> optimization in HotSpot for years, where the same statefulness is in the
> class hierarchy. (Did an overriding method just get loaded? Gee, gotta
> recompile.)
> Now the same state change hazards will also be in each call
> site, where dynamic languages need the help.

And for those of us who have struggled with our own call-site invalidation mechanisms (struggled as in bashed our heads against the walls to get a good combination of fast code paths, simple guards, concurrency, and accuracy all to line up), this is more welcome news.

> By itself, invokedynamic is not a full replacement for fixnums (small
> integers packed into a tagged pointer), since (for example) it doesn't
> help you create a list of ints without boxing. But it does provide full
> speed paths for important functions like generic arithmetic and sequence
> references (where the index types are often plain ints, and do not
> benefit from boxing). Boxed integers in the JVM are of reasonable
> performance now, and can be improved transparently if we put in fixnums
> some day (regarding which see my blog post on fixnums).

As you know, I'm interested in fixnums. One of the up-and-coming Ruby implementations is based on GemStone's Smalltalk VM, which I presume already has true fixnums. As a result of this, they basically smoke all other Ruby implementations when it comes to fixnum benchmarks. And I think largely the cost in JRuby comes from two things: fixnum being a boxed primitive type, and a call path that's too long for HotSpot to see it's just a boxed primitive type. Ultimately, the ability to wire in a fixnum type that's just a tagged int would push us well over the line for math performance, without the need to resort to really ugly tricks (over-hacks, in my opinion) to figure out when we can hold the JVM's hand and unbox for it.

- Charlie
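To make the call-site statefulness discussed above (CallSite.setTarget) a little more concrete, here is a minimal sketch. It assumes the java.lang.invoke API as it later shipped in Java 7, where the mutable variant is MutableCallSite; that API postdates this thread. The class CallSiteSketch and the greetV1, greetV2, find and demo names are hypothetical, standing in for a dynamic-language method that gets redefined at run time.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

final class CallSiteSketch {
    // Two hypothetical versions of the "same" dynamic method.
    static String greetV1(String name) { return "hello, " + name; }
    static String greetV2(String name) { return "goodbye, " + name; }

    // Look up a (String)String handle for one of the methods above.
    static MethodHandle find(String methodName) {
        try {
            return MethodHandles.lookup().findStatic(
                CallSiteSketch.class, methodName,
                MethodType.methodType(String.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Call through one call site before and after its target is replaced.
    static String[] demo() {
        MutableCallSite site = new MutableCallSite(find("greetV1"));
        // The invoker always dispatches to the site's current target.
        MethodHandle invoker = site.dynamicInvoker();
        try {
            String before = (String) invoker.invokeExact("Ruby");
            // Simulate a method redefinition: retarget the call site.
            site.setTarget(find("greetV2"));
            String after = (String) invoker.invokeExact("Ruby");
            return new String[] { before, after };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        for (String s : demo()) {
            System.out.println(s);
        }
    }
}
```

The sketch only shows the retargeting semantics. Any optimistic inlining and later invalidation of compiled code is up to the JIT and is not visible at this level.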