On Jun 3, 2011, at 4:15 PM, Tom Rodriguez wrote:

> On Jun 2, 2011, at 7:37 PM, John Rose wrote:
>
>> Thanks; I'll look at your dump later tonight.
>>
>> If the problem is friction from interface casts, we can probably remove
>> them. It's hard to figure out how they are getting in, though. It happens
>> when IRubyObject interconverts with Object.
>
> So I put in a little hack to fold repeated interface checkcasts and that gets
> back a lot of the performance. With fib on my machine dynopt=true reports
> 1.005000, invokedynamic=true reports 1.293000 and turning on my checkcast
> hack gets it down to 1.112000. Unfortunately what I've got right now isn't
> really suitable for inclusion in the JDK7.
>
> John, I noticed that it looks like MethodHandleWalk is injecting them for
> return values, though it's somewhat inconsistent. For instance, I see this:
>
>     // FIXME: consider inlining the invokee at the bytecode level
>
>     ArgToken ret = make_invoke(methodOop(invoker), vmIntrinsics::_none,
>                                Bytecodes::_invokevirtual, false, 1+argc,
>                                &arglist[0], CHECK_(empty));
>     DEBUG_ONLY(invoker = NULL);
>     if (rtype == T_OBJECT) {
>       klassOop rklass = java_lang_Class::as_klassOop(
>           java_lang_invoke_MethodType::rtype(recursive_mtype()) );
>       if (rklass != SystemDictionary::Object_klass() &&
>           !Klass::cast(rklass)->is_interface()) {
>         // preserve type safety
>
>         ret = make_conversion(T_OBJECT, rklass, Bytecodes::_checkcast,
>                               ret, CHECK_(empty));
>       }
>     }
>
> but down in make_invoke itself we do this:
>
>     switch (_rtype) {
>     case T_BOOLEAN: case T_BYTE: case T_CHAR: case T_SHORT:
>     case T_INT:    emit_bc(Bytecodes::_ireturn); break;
>     case T_LONG:   emit_bc(Bytecodes::_lreturn); break;
>     case T_FLOAT:  emit_bc(Bytecodes::_freturn); break;
>     case T_DOUBLE: emit_bc(Bytecodes::_dreturn); break;
>     case T_VOID:   emit_bc(Bytecodes::_return);  break;
>     case T_OBJECT:
>       if (_rklass.not_null() && _rklass() != SystemDictionary::Object_klass())
>         emit_bc(Bytecodes::_checkcast, cpool_klass_put(_rklass()));
>       emit_bc(Bytecodes::_areturn);
>
> This results in adapter bytecodes that look like this:
>
>     0 aload_1
>     1 aload #4
>     3 aload #5
>     5 aload_2
>     6 aload #6
>     8 invokevirtual 7
>       <org/jruby/internal/runtime/methods/DynamicMethod.call(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/RubyModule;Ljava/lang/String;)Lorg/jruby/runtime/builtin/IRubyObject;>
>       0 bci: 8 VirtualCallData count(10000) entries(0)
>     11 checkcast 8 <org/jruby/runtime/builtin/IRubyObject>
>       24 bci: 11 ReceiverTypeData count(10000) entries(0)
>     14 areturn
>
> which seems fairly pointless.

Yes, those are pointless and should be removed (with an extra !is_interface guard).

> These don't seem to be the source of the checkcasts in jruby though. They
> seem to be explicitly part of the method handle chain. For this chain:
>
>     0xeff0d808: adapter: arg_slot 0 conversion op check_cast (LLLLL)L
>     0xeff0d7a8: adapter: arg_slot 1 conversion op check_cast (LLLLL)L
>     0xeff0d748: adapter: arg_slot 2 conversion op check_cast (LLLLL)L
>     0xeff0d6e8: adapter: arg_slot 3 conversion op check_cast (LLLLL)L
>     0xeff0d688: adapter: arg_slot 4 conversion op check_cast (LLLLL)L
>     0xeff0d2b8: adapter: arg_slot 1 conversion op drop_args pushes -1 (LLLLL)L
>     0xeff0d1a8: adapter: arg_slot 2 conversion op drop_args pushes -1 (LLLL)L
>     0xeff0acd8: bound: arg_type object arg_slot 0 instance
>       org.jruby.runtime.Block (LLL)L
>     0xeff0ac68: bound: arg_type object arg_slot 4 instance
>       bench.bench_fib_recursive (LLLL)L
>
> we produce these bytecodes:
>
>     0 aload #5
>     2 checkcast 3 <org/jruby/runtime/builtin/IRubyObject>
>       0 bci: 2 ReceiverTypeData count(31244) entries(0)
>     5 astore #5
>     7 aload #4
>     9 checkcast 4 <java/lang/String>
>       24 bci: 9 ReceiverTypeData count(31244) entries(0)
>     12 astore #4
>     14 aload_3
>     15 checkcast 5 <org/jruby/runtime/builtin/IRubyObject>
>       48 bci: 15 ReceiverTypeData count(31244) entries(0)
>     18 astore_3
>     19 aload_2
>     20 checkcast 6 <org/jruby/runtime/builtin/IRubyObject>
>       72 bci: 20 ReceiverTypeData count(31244) entries(0)
>     23 astore_2
>     24 aload_1
>     25 checkcast 7 <org/jruby/runtime/ThreadContext>
>       96 bci: 25 ReceiverTypeData count(31244) entries(0)
>     28 astore_1
>     29 ldc <Object> 0xefe59f88
>     31 aload_1
>     32 aload_3
>     33 aload #5
>     35 ldc <Object> 0xefabd418
>     37 invokestatic 14
>       <bench/bench_fib_recursive.method__0$RUBY$fib_ruby(Lbench/bench_fib_recursive;Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/Block;)Lorg/jruby/runtime/builtin/IRubyObject;>
>       120 bci: 37 CounterData count(31244)
>     40 areturn

That looks like somebody did this:

    MethodHandle inner = #method__0$RUBY$fib_ruby;
    MethodHandle outer = inner.asType(inner.type().generic());

In other words, wrapped a moderately typeful method in an erased method type of all-Object.

The big question is who built that chain. One big answer is that pre-RF code was building such things routinely, in order to normalize signatures down to a few equivalence classes (arity only). But post-RF code doesn't need to do that.

I found a few places in MethodHandleImpl.java where needless asType calls were issued in order to normalize signatures. I changed those to an internal equivalent to explicitCastArgs, and pushed it.

> Just blindly skipping checkcast method handles for interface types brings the
> time on fib down to 1.071000.

That's promising.

-- John

_______________________________________________
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
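
The asType wrapping described in the message can be reproduced with the public java.lang.invoke API. What follows is a minimal sketch under stated assumptions, not the JRuby or JDK code: the class GenericWrapDemo and the target method greet are hypothetical stand-ins for the typed fib_ruby target, and findStatic is used in place of the #method syntax from the message. It illustrates that erasing a typed handle to an all-Object type still leaves a cast on each narrowed argument, which is consistent with the per-argument check_cast adapters in the chain quoted above.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class GenericWrapDemo {
    // Hypothetical "moderately typeful" target, standing in for fib_ruby.
    static String greet(String name, Integer times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(name);
        }
        return sb.toString();
    }

    // Typed handle: (String,Integer)String.
    static MethodHandle inner() throws ReflectiveOperationException {
        return MethodHandles.lookup().findStatic(
                GenericWrapDemo.class, "greet",
                MethodType.methodType(String.class, String.class, Integer.class));
    }

    // Same wrapping as in the message: erase every type to Object.
    static MethodHandle outer() throws ReflectiveOperationException {
        MethodHandle inner = inner();
        return inner.asType(inner.type().generic());
    }

    // Call through the erased handle; both arguments are statically Object.
    static Object callErased(Object a, Object b) throws Throwable {
        return outer().invoke(a, b);
    }

    // True if the erased handle still rejects a first argument that is not a String.
    static boolean rejectsWrongType() throws Throwable {
        try {
            callErased(Integer.valueOf(42), Integer.valueOf(2));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Throwable {
        System.out.println("inner type: " + inner().type());
        System.out.println("outer type: " + outer().type());
        System.out.println("result:     " + callErased("ab", Integer.valueOf(2)));
        System.out.println("cast still enforced: " + rejectsWrongType());
    }
}
```

Passing an Integer where the target expects a String makes the erased handle throw ClassCastException, so the casts are still performed behind the all-Object signature even though callers no longer see the narrower types.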