I would think that this is a case where a VM has a chance to shine.
Your "eachCommon" just needs to be inlined into the call site, and the
call to yield becomes monomorphic. That's the primary optimization
that makes the JVM run fast.

So, if method handles or something else disables this optimization, or
makes it more expensive, then perhaps we should consider adding a
hint/annotation telling the VM "it may make sense to inline this". But
I would think that this is exactly the kind of assessment the VM can
make better than static knowledge allows.
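
Purely as a sketch of the kind of hint I mean (the annotation below is
made up; nothing like it exists in the JDK or in JRockit today):

    import java.lang.annotation.*;

    // Hypothetical marker: "this method's main job is to call the code
    // it was handed, so consider specializing/inlining it per call site".
    @Retention(RetentionPolicy.CLASS)
    @Target(ElementType.METHOD)
    @interface MaySpecializeIntoCaller {
    }

    class HintSketch {
        // The hint would sit on closure-receiving methods like eachCommon.
        @MaySpecializeIntoCaller
        static void repeat(int n, Runnable body) {
            for (int i = 0; i < n; i++) {
                body.run();
            }
        }
    }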

Kresten

On Apr 2, 7:40 pm, Charles Oliver Nutter <[email protected]> wrote:
> I'm getting deeper into JRuby optimization lately, and starting to
> ponder strategies for optimizing closures (or, in the case of Clojure
> and Scala, optimizing pass-through methods that make megamorphic
> callbacks to functions).
>
> First, some explanation...
>
> In Ruby, the "each" method on Array receives a block of code:
>
> [1,2,3].each {|i| do_something_with i}
>
> "each" is implemented as a simple loop over Array elements, each
> iteration calling Block.yield on a Block object passed in. Block
> aggregates the code body above with a Binding object holding the
> surrounding closed-over state.
>
>     public IRubyObject eachCommon(ThreadContext context, Block block) {
>         if (!block.isGiven()) {
>             throw context.getRuntime().newLocalJumpErrorNoBlock();
>         }
>         for (int i = 0; i < realLength; i++) {
>             // do not coarsen the "safe" catch, since it will
>             // misinterpret AIOOBE from the yielded code.
>             // See JRUBY-5434
>             block.yield(context, safeArrayRef(values, begin + i));
>         }
>         return this;
>     }
>
> Similar idioms certainly exist in any other JVM languages that allow
> passing functions or closures around.
>
> The problem here is that the "yield" call is always going to go
> megamorphic very quickly. There will be dozens of places in a typical
> Ruby (or Clojure, or Scala, or Groovy) application that use the same
> list-iteration logic, and they all pass through the same generic code.
> That means the best you can do is inline the loop logic into the
> caller; the closure body won't inline and will have to be optimized
> on its own.
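>
> Roughly, the shape of the problem is something like this (all names
> made up, just a sketch, not JRuby's actual class layout):
>
>     // Every block literal in the program becomes its own piece of
>     // compiled code, and they all arrive at the single yield call
>     // inside the shared iteration method.
>     abstract class BlockSketch {
>         abstract Object yield(Object context, Object value);
>     }
>
>     class BlockAtFooRb12 extends BlockSketch {      // {|i| do_something_with i }
>         Object yield(Object context, Object value) { return value; }
>     }
>
>     class BlockAtBarRb40 extends BlockSketch {      // {|x| x.to_s }
>         Object yield(Object context, Object value) { return value; }
>     }
>
>     class EachSketch {
>         // The one hot call site: after a few callers run, the JIT sees
>         // BlockAtFooRb12, BlockAtBarRb40, ... here and stops inlining.
>         static void eachCommon(Object[] values, Object context, BlockSketch block) {
>             for (Object v : values) {
>                 block.yield(context, v);
>             }
>         }
>     }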
>
> This also has implications for escape analysis. If there's any state
> (including the closure object itself) involved in doing that
> iteration, it's now impossible for it to EA away, since the closure
> can't inline all the way back into the caller. Any heap-based
> structures are now firmly on the heap, and add to our allocation
> overhead.
>
> Now, strategies...
>
> In general, what's needed is a way to specialize "each" for many
> different call sites and closures.
>
> The easiest for us implementers would be if the JVMs simply started to
> see through these calls. Closure/function-receiving code is going to
> become more and more common, and indeed was already rather common for
> event-handling systems and the like. Rémi and I talked with Fredrik
> (JRockit) two JVMLSs ago about how JRockit might be able to optimize
> these cases. Fredrik believed it would be possible, but that some sort
> of marker was needed on the "each" method to show JRockit that it's
> code that calls code. I forget who suggested it, but we decided an
> easy marker would be to have the signature receive a MethodHandle or
> subtype. JRockit (and perhaps other JVMs) could use that as an
> indication that this method should be specialized to the caller and
> the provided closure.
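>
> As a rough sketch of that marker idea (hypothetical code, not anything
> JRockit actually shipped):
>
>     import java.lang.invoke.MethodHandle;
>
>     // The MethodHandle parameter itself is the signal: this method
>     // mostly exists to invoke the code it was given, so a JVM could
>     // choose to specialize it per caller/handle.
>     class EachViaHandle {
>         static Object eachCommon(Object[] values, MethodHandle body) throws Throwable {
>             Object last = null;
>             for (Object v : values) {
>                 last = body.invoke(v);   // target is the passed-in closure
>             }
>             return last;
>         }
>     }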
>
> Barring JVM help, we are likely to attempt to optimize this case in
> JRuby directly. My strategy will be to see (via runtime profiling)
> that particular closure-receiving methods are hot, and do the
> specialization myself. This would boil down to having JRuby's JIT emit
> both the caller's class body *and* a copy of the "each" body into the
> compiled result. The caller, each, and the closure (also in the same
> result) would lie along a monomorphic path, allowing all three to
> inline together. This will be easy with Ruby code; if I see a
> closure-passing call to another Ruby method, I emit a body for that
> method too. For core JRuby methods, which are implemented in Java, it
> will be trickier; I'll either need a simple way to duplicate those
> methods in place or I'll need to move them into Ruby.
> Ironically, we may be seeing the beginning of an age where it's faster
> to implement JRuby core classes in Ruby.
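>
> As a sketch of the shape I'm after (made-up names again, reusing the
> BlockAtFooRb12 class from the sketch above; this is not actual JRuby
> JIT output):
>
>     // A private copy of the each loop emitted for one hot call site.
>     // The only block type that can reach this copy is BlockAtFooRb12,
>     // so the yield call below is monomorphic by construction and the
>     // caller, the loop, and the block body can all inline together.
>     class SpecializedEachForFooRb12 {
>         static Object run(Object[] values, Object context, BlockAtFooRb12 block) {
>             for (Object v : values) {
>                 block.yield(context, v);
>             }
>             return values;   // each returns the receiver
>         }
>     }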
>
> Have any of you other implementers thought about this? What are you
> considering as strategies for optimizing closure invocations?
>
> - Charlie
