On Thu, Feb 24, 2011 at 7:23 AM, Stephen Bannasch <stephen.banna...@deanbrook.org> wrote:
> Charlie, can you give me a bit of context.
>
> " I'd like to have indy be competitive with my dynopt logic,
> since they are structurally identical"
>
> Remind me what's the difference between indy and dynopt?
>
> Which one relates to what will be in java 1.7 when it is first released?
>
> I remember JRuby previously running faster using mlvm.

Perhaps this is useful for the whole list.

Over the years, we've gone through many dispatch mechanisms in JRuby. I'll describe the ones currently available first.

1. Normally JRuby still dispatches all calls via org.jruby.runtime.CallSite subclasses, generally via some subclass of CachingCallSite (https://github.com/jruby/jruby/blob/master/src/org/jruby/runtime/callsite/CachingCallSite.java). CachingCallSite is a monomorphic cache, holding a tuple of an integer and a JRuby "DynamicMethod" reference. DynamicMethod is basically our method handle. The integer is derived from a "serial number" of the class we cached from at the time of caching. Guarding the site then is a matter of comparing the incoming receiver's class's serial number with the cached serial number. Invocations from Ruby code pass *through* the call site via one of the CallSite.call methods, which defeats inlining on current JVMs, so we're not achieving the best perf possible.

2. A second mode uses largely the same mechanism, but instead of calling *through* CallSite it pulls the DynamicMethod all the way back to the Ruby call site in JVM bytecode and invokes it there. This allows the target call to inline in many cases (since we generate a unique DynamicMethod handle class per Ruby method), but because DynamicMethod itself introduces a lot of extra logic and extra stack frames, it doesn't inline as well as a direct call could.

3. The newer "dynopt" mechanism uses the same guard mechanism again, but instead of dispatching through CallSite or DynamicMethod it often dispatches *directly* to the target Java method. Essentially, if the method cached by the interpreter has a JVM method somewhere, dynopt emits the guard with the success path making a direct invokevirtual or invokestatic to the target JVM method, and the fail path using CallSite logic.
This allows code to inline extremely well; "fib" and "tak" performance nearly matches Java performance (for a Java version that also uses RubyFixnum objects) and recursive calls inline straight through. But inlining other Ruby methods is trickier, since we usually load them into their own classloaders (and it's not possible to emit invokevirtual or invokestatic calls to methods loaded in sibling classloaders).

4. The recent work on invokedynamic leverages the dynopt work. The guard is again the same as in the above three cases; I have not yet explored MutableCallSite.sync for actively invalidating (thereby eliminating the guard). The success path from GWT (guardWithTest) is one of two pieces of logic: either a direct handle to the target method, just like dynopt; or a handle to a DynamicMethod object. The eventual goal is for dispatch to be all direct handles, with any logic currently in DynamicMethod moved into additional handles decorating the target.

So back to your original question...

I expect (hope?) that invokedynamic will eventually perform as well as dynopt because:

* Both use the same guard logic
* Both have direct paths to target methods

Logically, both mechanisms are identical, with the only real difference being that dynopt is essentially "pre-inlined" invokedynamic logic.

Currently, mechanism 4 (invokedynamic as it is used in JRuby today) is sometimes faster than mechanism 1 (CallSite), but comes nowhere near mechanism 3 (dynopt). I would provide numbers, but I don't have Christian's fixes for recursive indy calls. And of course, algorithms like redblack are still having some perf issues with indy, running about 2x slower than standard CallSite-based dispatch.

Excluding indy for now, here are numbers for 1, 2, 3 above plus a version running with dynopt and no guards (in theory, what I would expect from indy + MutableCallSite.sync to eliminate guards there). I use fib(35) here because it shows well how inlining helps performance.
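
For anyone who wants the guard-plus-direct-path shape in concrete form, here is a minimal sketch using the java.lang.invoke API (MutableCallSite and MethodHandles.guardWithTest). It is not JRuby's code. Obj, classSerial, greetA, greetB, serialMatches, fallback and demo are all hypothetical names made up for illustration, and the hard-coded choice inside fallback stands in for real Ruby method lookup. The site is a monomorphic cache: the guard compares the receiver's class serial number with the one cached at link time, the success path is a direct handle to the target, and the fail path relinks.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

final class GuardedSiteSketch {
    // Hypothetical stand-in for a Ruby object whose class carries a serial number.
    static final class Obj {
        final int classSerial;
        final String name;
        Obj(int classSerial, String name) { this.classSerial = classSerial; this.name = name; }
    }

    // Hypothetical targets, standing in for compiled Ruby method bodies.
    static String greetA(Obj self) { return "A:" + self.name; }
    static String greetB(Obj self) { return "B:" + self.name; }

    // Guard: does the receiver's class serial match the serial cached at link time?
    static boolean serialMatches(int cachedSerial, Obj self) {
        return self.classSerial == cachedSerial;
    }

    static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    static final MethodType SITE_TYPE = MethodType.methodType(String.class, Obj.class);
    static int relinks = 0; // counts trips through the slow path

    // Fail path: pick a target for this receiver, cache it behind a new guard, invoke it.
    static String fallback(MutableCallSite site, Obj self) throws Throwable {
        relinks++;
        // Assumption: serial 1 maps to greetA, anything else to greetB.
        String targetName = (self.classSerial == 1) ? "greetA" : "greetB";
        MethodHandle target = LOOKUP.findStatic(GuardedSiteSketch.class, targetName, SITE_TYPE);
        MethodHandle test = MethodHandles.insertArguments(
                LOOKUP.findStatic(GuardedSiteSketch.class, "serialMatches",
                        MethodType.methodType(boolean.class, int.class, Obj.class)),
                0, self.classSerial);
        // guard ? direct handle to target : back to this fallback
        site.setTarget(MethodHandles.guardWithTest(test, target, fallbackHandle(site)));
        return (String) target.invokeExact(self);
    }

    // Binds the site into fallback so the resulting handle has the site's type.
    static MethodHandle fallbackHandle(MutableCallSite site) throws ReflectiveOperationException {
        MethodHandle fb = LOOKUP.findStatic(GuardedSiteSketch.class, "fallback",
                MethodType.methodType(String.class, MutableCallSite.class, Obj.class));
        return MethodHandles.insertArguments(fb, 0, site);
    }

    // A fresh, unlinked site: its first call always takes the fail path.
    static MutableCallSite newSite() throws ReflectiveOperationException {
        MutableCallSite site = new MutableCallSite(SITE_TYPE);
        site.setTarget(fallbackHandle(site));
        return site;
    }

    static String demo() {
        try {
            relinks = 0;
            MethodHandle invoker = newSite().dynamicInvoker();
            String r1 = (String) invoker.invokeExact(new Obj(1, "x")); // miss: links greetA
            String r2 = (String) invoker.invokeExact(new Obj(1, "y")); // hit: guard passes
            String r3 = (String) invoker.invokeExact(new Obj(2, "z")); // miss: relinks to greetB
            return r1 + " " + r2 + " " + r3 + " relinks=" + relinks;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "A:x A:y B:z relinks=2"
    }
}
```

In this sketch the guard stays in the handle tree on every call. Dropping it would mean replacing the target from the outside whenever a class changes, which is the active-invalidation approach mentioned under mechanism 4 and is not shown here.
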
https://gist.github.com/842528

- Charlie
_______________________________________________
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev