Thanks for the update, John! Comments below...

On Wed, May 9, 2012 at 2:34 PM, John Rose <john.r.r...@oracle.com> wrote:
> In JDK 7 FCS a method handle is represented as a chain of argument 
> transformation blocks, ending in a pointer to a methodOop.  The argument 
> transformations are assembly coded and work in the interpreter stack.  The 
> reason this is not outrageously slow is that we vigorously inline method 
> handle calls whenever we can.  But there is a performance cliff you can drop 
> off of, when you are working with non-constant MHs.  (BTW, invokedynamic 
> almost always inlines its target.)  Project Lambda needs us not to drop off 
> of this cliff.

And I need you to not drop off that cliff too! It's very easy to
trigger...just make a method big enough, and AAAAAAAARRGH into the pit
you go.
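
To make the cliff concrete for anyone following along, here's a toy
sketch (my own illustration, not JRuby code): a handle held in a
static final field is a constant the JIT can see through and inline,
while the same handle in a mutable field goes down the generic (and
much slower) call path.

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    public class CliffDemo {
        // Constant handle: the JIT sees the target and can inline it.
        static final MethodHandle CONSTANT_MH;

        // Non-constant handle: same target, but the JIT can't treat it
        // as a constant, so calls take the generic MH path.
        static MethodHandle mutableMh;

        static {
            try {
                MethodHandle mh = MethodHandles.lookup().findStatic(
                        CliffDemo.class, "addOne",
                        MethodType.methodType(int.class, int.class));
                CONSTANT_MH = mh;
                mutableMh = mh;
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        static int addOne(int x) { return x + 1; }

        static int viaConstant(int x) throws Throwable {
            return (int) CONSTANT_MH.invokeExact(x);  // typically inlines
        }

        static int viaMutable(int x) throws Throwable {
            return (int) mutableMh.invokeExact(x);    // generic call path
        }
    }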

Luckily, for the ambitious early-access JRuby users running JRuby
master + Java 7u2+ in production, the code they're hitting is all
small enough to avoid the cliff. But with the JRuby 1.7 preview
release coming out in a couple of weeks, more people are going to
start trying things out.

> To fix this, we are now representing the argument transformations using a 
> simple AST-like IR, called a LambdaForm.  This form can be easily rendered 
> down to bytecodes.  (Eventually it may be rendered directly to native code.)  
> The form is *also* interpretable by a Java-coded AST walker.  This allows the 
> system to be lazy, and to work hardest on optimizing those method handles 
> that are actually called frequently.  The laziness also helps simplify 
> bootstrapping.  The remaining assembly code is much smaller, and can be 
> mirrored in the JIT IR and optimized.

It also creates some *epic* stack traces when it blows up. Will those
fold away in the future?
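
For anyone who hasn't peeked at the new code yet, the general shape of
the lazy scheme as I understand it (heavily simplified by me; this is
*not* the actual LambdaForm class) is: interpret the little IR with a
Java-coded walker, and only spin something faster for the forms that
turn out to be hot.

    import java.util.function.Function;

    // Toy sketch only: interpret an IR node until it proves hot, then
    // swap in a "compiled" version. The real LambdaForm machinery is
    // far more involved; this just shows the interpret-then-compile
    // laziness.
    interface Node {
        Object interpret(Object[] args);  // slow, Java-coded walker
    }

    final class LazyForm {
        private static final int THRESHOLD = 100;  // arbitrary

        private final Node ir;
        private int count;
        private Function<Object[], Object> compiled;  // stand-in for bytecode

        LazyForm(Node ir) { this.ir = ir; }

        Object invoke(Object[] args) {
            if (compiled != null) {
                return compiled.apply(args);
            }
            if (++count > THRESHOLD) {
                // In the real thing this would render the IR down to
                // bytecode; here we just wrap the interpreter to keep
                // the sketch small.
                compiled = ir::interpret;
            }
            return ir.interpret(args);
        }
    }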

> Here's an update on where we are.  Christian Thalinger, Michael Haupt, and I 
> are currently working on the following tasks:
>
> A. clean out the compiled method calling path, for non-constant method handles
> B. flatten the BMH layout (no boxing, linked lists, or arrays)
> C. make the handling of MethodType checking visible to the compiler (removing 
> more assembly code)
> D. tuning reuse and compilation of LambdaForm instances
> E. profiling MH.LambdaForm values at MH call sites
> F. tuning optimization of call sites involving LFs

I have been tossing numbers and benchmarks back and forth with
Christian, and now testing a local build of the meth-lazy stuff
myself. Numbers haven't been great so far, but I think Christian made
real progress today (based on an email showing C1 + indy beating C1
without indy, and drastically beating C1 + indy in a stock u6 build
that falls off the cliff). It's very exciting!

> For A. the remaining snag is getting the argument register assignments 
> correct for the call to the target method.  There is also an issue with 
> representing non-nominal calls in the backend.

I assume this is the problem Christian described to me, where it was
calling back into the interpreter to fix up the arguments?

> For B. we are currently working on bootstrap issues.  The idea here is that, 
> while we can do escape analysis, etc., a cleaner data structure will make the 
> compiler succeed more often.

I will be *thrilled* when EA works across indy call sites. We have
started work on our new compiler, which uses a simpler intermediate
representation and which will be indy-only from day 1. Already we're
seeing gains since we don't have to hand-write all the different call
paths we want to represent; we can wire up any combination of
arguments, handles, and targets using only method handles (see the
sketch after the list below). That means we're doing things that will
be ripe for EA, like:

* Allocating heap storage for closures right next to the closure creation
* Passing closures as a handle rather than as an opaque, polymorphic structure
* Specializing closure-receiving code in *our* compiler until Hotspot
can specialize it for us
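
For a flavor of what I mean by wiring things up with only method
handles, here's a hypothetical snippet (made-up names, not our actual
compiler code): pre-bind one argument with insertArguments, then erase
the rest of the signature to Object the way a dynamic call site sees
it.

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    public class Wiring {
        // Hypothetical target standing in for a Ruby method body.
        public static String greet(String self, String name) {
            return self + " says hi to " + name;
        }

        public static void main(String[] args) throws Throwable {
            MethodHandle greet = MethodHandles.lookup().findStatic(
                    Wiring.class, "greet",
                    MethodType.methodType(String.class, String.class, String.class));

            // Pre-bind the receiver-like first argument...
            MethodHandle bound = MethodHandles.insertArguments(greet, 0, "some-object");

            // ...and erase the remaining signature to Object, the way a
            // dynamic-language call site would see it.
            MethodHandle erased = bound.asType(
                    MethodType.methodType(Object.class, Object.class));

            Object result = (Object) erased.invokeExact((Object) "charlie");
            System.out.println(result);
        }
    }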

I'd be very surprised if we can't approach Java performance for the
*general* cases of Ruby code by end of year, and if we can specialize
closure-receiving code *and* get EA, we might be able to compete with
Java 8 lambda performance for Ruby's closures too.

We also have our own profiling, inlining, and so on... but that's all
done above the level of bytecode, to work around as-yet-unoptimized
patterns in Hotspot. :)

> For C. we have a refactoring in process for moving the MT value out of the 
> methodOop.
>
> Chris, Michael, and I are working on A, B, C, respectively.  We think a first 
> cut of lazy MHs needs the first three items in order to be reasonably faster 
> than the all-assembly implementation of JDK 7.
>
> In order to address the infamous NoClassDefFound error, we are minimizing 
> nominal information in MH adapter code (LambdaForms and their bytecode).  
> Only names on the BCP will be in adapter code.   Part C. is an important part 
> of this, since it allows the system to internally "flatten" calls like 
> MH.invokeExact((MyFunnyType)x) to MH.invokeExact((Object)x).  The new 
> internal MH methods (invokeBasic, invokeStatic, etc.) all use "denominalized" 
> types, which is to say that all reference types are represented as 
> java.lang.Object.

I have not been able to stump Chris with any NCDFEs lately, so that's
good. But I do have some hacks in place to prevent them that I can't
remove until the new logic solidifies a bit.
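
To spell out the denominalization John describes with a toy example of
my own (MyFunnyType here is just a made-up stand-in class): the
nominal call spells the funny type's name out in the descriptor, while
the denominalized version pushes every reference type through
java.lang.Object and lets asType do the cast.

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    public class Denominalize {
        // MyFunnyType is hypothetical; it stands in for any app-level class.
        static class MyFunnyType {
            @Override public String toString() { return "funny"; }
        }

        static String describe(MyFunnyType x) { return "got " + x; }

        public static void main(String[] args) throws Throwable {
            MethodHandle mh = MethodHandles.lookup().findStatic(
                    Denominalize.class, "describe",
                    MethodType.methodType(String.class, MyFunnyType.class));

            // Nominal call: the descriptor names MyFunnyType, the kind of
            // app-class name that adapter code should no longer mention.
            String s1 = (String) mh.invokeExact(new MyFunnyType());

            // Denominalized call: every reference type becomes Object; the
            // cast back to MyFunnyType happens inside the asType adapter.
            MethodHandle erased = mh.asType(
                    MethodType.methodType(Object.class, Object.class));
            Object s2 = (Object) erased.invokeExact((Object) new MyFunnyType());

            System.out.println(s1 + " / " + s2);
        }
    }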

Now that the logic has started to land, I'm going to do some
benchmarking and assembly-reading of my own to help from my end. And
hopefully I'll be able to help more directly over the summer.

Very exciting stuff...I'm thrilled that dynlangs and indy are being
taken so seriously. I told a couple thousand people at JAX 2012 how
strongly I believe that indy is the most important work happening on
the JVM right now, and I'm looking forward to doing more and more with
it :)

- Charlie