Charlie,
Is it acceptable, and does it solve the problem for you?
This is acceptable for JRuby. Our worst-case Ruby method handle chain
will include at most:
* Two CatchExceptions for pre/post logic (heap frames, etc.). The
performance of CatchException compared to a literal Java try/catch is
important here.
* Up to two argument permutations for differing call site/target argument ordering.
* Varargs negotiation (may be a couple handles)
* GWT
* SwitchPoint
* For Ruby to Java calls, each argument plus the return value must be
filtered to convert to/from Ruby types or apply an IRubyObject wrapper
This is worst case, mind you. Most calls in the system will be
arity-matched, eliminating the permutes. Most calls will be three or
fewer arguments, eliminating varargs. Many calls will be optimized to
no longer need a heap frame, eliminating the try/finally. The absolute
minimum for any call would be SwitchPoint plus GWT.
Of course I'm not counting DMHs here, since they're either the call we
want to make or they're leaf logic.
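In java.lang.invoke terms, that worst case looks something like this
sketch (the guard, fallback, conversion, and frame handles are all
placeholders, not our actual binding code):

MethodHandle call = target;  // the DMH we actually want to invoke
// Ruby-to-Java: filter each argument and the return value
call = MethodHandles.filterArguments(call, 0, rubyToJava, rubyToJava);
call = MethodHandles.filterReturnValue(call, javaToRuby);
// differing call site/target argument ordering
call = MethodHandles.permuteArguments(call, siteType, 1, 0);
// (varargs negotiation would add asCollector/asSpreader handles here)
// pre/post logic (heap frames, etc.); a real try/finally shape needs a
// second catchException and more plumbing than shown here
call = MethodHandles.catchException(call, Throwable.class, framePost);
call = MethodHandles.foldArguments(call, framePre);
// GWT: is this still the method we cached?
call = MethodHandles.guardWithTest(typeGuard, call, fallback);
// SwitchPoint: invalidated when the method is redefined
call = switchPoint.guardWithTest(call, fallback);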
Thanks for the data! That's good!
We discussed an idea to generate custom bytecode (a single method) for
the whole method handle chain (and have only 1 extra stack frame per MH
invocation), but it defeats the memory footprint reduction we are
trying to achieve with LambdaForm sharing.
Funny thing...because indy slows our startup and increases our warmup
time, we're using our old binding logic by default. And surprise
surprise, our old binding logic does exactly this...one small
generated invoker class per method. I'm sure you're right that this
approach defeats the sharing and memory reduction we'd like to see
from LFs, but it works *really* well if you're ok with the extra class
and metaspace data in memory.
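The shape of such an invoker is trivial; roughly like this, hand-written
against a made-up Invoker interface rather than JRuby's actual method
hierarchy (class, interface, and argument layout are all hypothetical):

interface Invoker { Object call(Object self, Object[] args); }

// One tiny generated class per bound method: nothing but casts and a
// direct call to the target method.
final class Target_someMethod_Invoker implements Invoker {
    @Override
    public Object call(Object self, Object[] args) {
        return ((Target) self).someMethod((String) args[0], (Integer) args[1]);
    }
}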
I see one problem with pre-compiling method handle trees: every tree
has to be compiled as a whole, so the fast path and the slow path are
always compiled together. Without explicit hints, or profiling and
recompilation, it's impossible to distinguish them.
Compared with the MethodHandle/LambdaForm compilation unit, where the
slow path usually stays interpreted at the LF level (due to the
invocation threshold), the memory overhead for considerably large
method handle trees can be larger.
But I'm just guessing here - I don't have any statistics yet, either on
the average size of method handle trees or on the memory overhead
induced by the individual classes.
So there's one question: is the cost of a bytecoded adapter shim for
each method object really that high? Yes, if you're spinning new MHs
constantly or doing a million different adaptations of a given method.
But if you're just lazily creating an invoker shim once per method,
that really doesn't seem like a big deal.
Good question. I have a prototype of LF inlining during bytecode
translation. I'll conduct some experiments to gather some data.
My indy binding logic also has a dozen different flags for tweaking. I
can easily modify it to avoid doing all that pre/post logic and
argument permutation in the MH chain and just bind directly to the
generated invoker. Best (or worst) of both worlds? I just really don't
want to have to do that...I want everything from call site to target
method body to be in the MH chain.
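Binding directly would amount to something like this sketch (reusing
the hypothetical invoker shape from above; the call site and invoker
instance are likewise placeholders):

// Point the indy call site straight at the generated invoker's call(),
// with no adaptation chain in between.
MethodHandle direct = MethodHandles.lookup()
        .findVirtual(Target_someMethod_Invoker.class, "call",
                MethodType.methodType(Object.class, Object.class, Object[].class))
        .bindTo(invokerInstance);
callSite.setTarget(direct);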
For JRuby 9000, all try/finally logic will be within the target
method, so at least that part of the MH chain goes away.
Here's another idea...
We've been using my InvokeBinder library heavily in JRuby. It provides
a Java API/DSL for creating MH chains lazily from the top down:
MethodHandle mh = Binder.from(String.class, Object.class, Float.class)
.tryFinally(finallyLogic)
.permute(1, 0)
.append("Hello")
.drop(1)
.invokeStatic(MyClass.class, "someMethod");
The adaptations are gathered within the Binder instance, played forward
as you add them and played backward at binding time to make the
appropriate MethodHandles and MethodHandle calls.
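Played backward, the chain above boils down to roughly these raw
MethodHandles calls (assuming someMethod is a static String
someMethod(Float, String) on MyClass; resulting types in comments):

MethodHandle target = MethodHandles.lookup().findStatic(
        MyClass.class, "someMethod",
        MethodType.methodType(String.class, Float.class, String.class));
// drop(1): reintroduce the Object argument the target never sees
MethodHandle raw = MethodHandles.dropArguments(target, 1, Object.class); // (Float,Object,String)String
// append("Hello"): bind the trailing String constant
raw = MethodHandles.insertArguments(raw, 2, "Hello");                    // (Float,Object)String
// permute(1, 0): restore the call site's (Object,Float) ordering
raw = MethodHandles.permuteArguments(raw,
        MethodType.methodType(String.class, Object.class, Float.class), 1, 0); // (Object,Float)String
// tryFinally(finallyLogic) would then wrap the whole thing (two
// CatchExceptions and friends on Java 7/8); omitted here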
Duncan talked about how he was able to improve MH chain size and
performance by applying certain transformations in a different order,
among other things. InvokeBinder *could* be doing a lot more to
optimize the MH chain. For example, the above case never uses the
Object value passed in (it is permuted to position 1 and later
dropped), but that fact is obscured by the intervening append.
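For instance, since that Object never reaches the target, the whole
adaptation could in principle collapse to two handles with no permute
at all (same hypothetical target as in the sketch above):

// bind the constant, then add back an ignored leading Object argument
MethodHandle optimized = MethodHandles.dropArguments(
        MethodHandles.insertArguments(target, 1, "Hello"),  // (Float)String
        0, Object.class);                                   // (Object,Float)String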
InvokeBinder is basically doing with MHs what MHs do with LFs. Perhaps
what we really need is a more holistic view of MH + LF operations
*together* so we can boil the whole thing down (even across MH lines)
before we start interpreting or compiling it?
The idea of rearranging method handles looks interesting. If the JSR292
framework treated some method handle chains specially (like having a
custom LambdaForm shape for nested guards), it would be beneficial to
favor such shapes in the binder.
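For example, a shape where guards stack directly (all handles below are
placeholders):

// two guardWithTest handles nested immediately; a dedicated LambdaForm
// shape could treat this as a single unit instead of two generic GWTs
MethodHandle nested = MethodHandles.guardWithTest(classGuard,
        MethodHandles.guardWithTest(identityGuard, fastPath, fallback),
        fallback);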
Best regards,
Vladimir Ivanov
- Charlie
_______________________________________________
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev